What is XMagic?
XMagic is an alternate implementation of file typing based on Magic
numbers or tests. XMagic was inspired by the venerable file(1)
command.
Where can I get the current XMagic file?
The current XMagic file is available here.
Why should I use file typing via XMagic?
Baselining a system prior to deploying it is a good security practice
because the information collected may be used to determine what
directories and files have changed since deployment. If a prior
baseline doesn't exist and the system is compromised, the process
of determining change becomes much harder. One approach is to
construct and baseline a virgin system that approximates the original
system. A snapshot of the compromised system can then be compared to
the virgin baseline. This technique could, in theory, reduce the
number of files that must be reviewed to a very small set. In
practice, this technique is good for eliminating system files that
typically do not and should not change. Unfortunately, the number of
files that aren't eliminated by this technique is usually quite large.
So large, in fact, that some additional technique is needed to
prioritize the list of unknown/suspicious files into manageable
subsets. This is where file typing via XMagic can help. By sorting
the list of unknown/suspicious files by type, the practitioner can
focus on groups that are relevant to the investigation such as scripts
and executables.
In a workbench environment, file type information can be collected
using existing tools (e.g., the standard file(1) command). In an
operational environment, it's more efficient and less invasive (think
timestamps) to collect all desired attributes, including file type, in
a single pass. This drove the decision to add Magic support to
FTimes.
What is the format of XMagic?
The best way to learn about XMagic is to start by reading the magic(5)
man page, which is available here.
Another good resource is the Magic file that ships with file(1);
study it to understand how Magic works in practice. The Magic file
that shipped with file(1) 4.17 can be found here.
After you've done that, take a look at the next question.
What are the differences between XMagic and Magic?
Since XMagic support was built from scratch, it requires a modified
form of the Magic file that ships with file(1). The most significant
differences between XMagic and Magic are as follows:
- The test operator/value pair has been split into separate fields.
- XMagic supports regular expression Magic via Perl Compatible
Regular Expressions (PCRE). The associated test operators are as
follows:
=~ The expression must match
!~ The expression must not match
- XMagic supports block-based entropy calculations. The associated
test types are: row_entropy_1 and row_entropy_2
- XMagic supports block-based average calculations. The associated
test types are: row_average_1 and row_average_2
- XMagic supports block-based percent calculations for various
ctype(3) character classes. The associated test types are:
percent_ctype_alnum, percent_ctype_alpha, percent_ctype_ascii,
percent_ctype_cntrl, percent_ctype_digit, percent_ctype_lower,
percent_ctype_print, percent_ctype_punct, percent_ctype_space, and
percent_ctype_upper
- XMagic supports block-based hash calculations. The associated test
types are: md5 and sha1
- XMagic also supports several different test operators for all of
its block-based tests. These operators are listed and described
here:
[] (greater than or equal to) and (less than or equal to)
[) (greater than or equal to) and (less than)
(] (greater than) and (less than or equal to)
() (greater than) and (less than)
][ (less than or equal to) or (greater than or equal to)
]( (less than or equal to) or (greater than)
)[ (less than) or (greater than or equal to)
)( (less than) or (greater than)
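The eight range operators can be read as interval tests. The following is a minimal Python sketch, not XMagic's actual code. It assumes the first comparison applies to a lower bound and the second to an upper bound; the names RANGE_OPS and in_range are hypothetical, and the way the two bounds are written in the value field is not covered here.

```python
import operator

# Each two-character operator names a test against the lower bound, a test
# against the upper bound, and how the two results are combined:
# all() for "and" (inside the range), any() for "or" (outside the range).
RANGE_OPS = {
    "[]": (operator.ge, operator.le, all),
    "[)": (operator.ge, operator.lt, all),
    "(]": (operator.gt, operator.le, all),
    "()": (operator.gt, operator.lt, all),
    "][": (operator.le, operator.ge, any),
    "](": (operator.le, operator.gt, any),
    ")[": (operator.lt, operator.ge, any),
    ")(": (operator.lt, operator.gt, any),
}


def in_range(op, value, low, high):
    """Evaluate a range operator for value against the bounds low and high."""
    lower_test, upper_test, combine = RANGE_OPS[op]
    return combine((lower_test(value, low), upper_test(value, high)))
```

Under these assumptions, in_range("[)", 7.2, 7.0, 8.0) is true for a block whose entropy is at least 7.0 but below 8.0, while in_range(")(", 7.2, 7.0, 8.0) is false because the value lies inside the excluded range.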
Currently, several file(1) types/operators are not supported by
XMagic. Some of the unsupported types (e.g., string/[Bbc] and
search/<number>) are not necessary because equivalent Magic
incantations can be crafted using regular expressions. However, we
are planning to implement support for missing types/operators where it
makes sense to do so.
While the test operator/value difference is minor, it removes
ambiguities (e.g., '!<arch>'), simplifies parser code, and
allows operators to exceed file(1)'s one-character length restriction.
In the case where the 'x' operator has been specified (meaning there
is no test to perform), a single hyphen, '-', is inserted in the value
field to act as a placeholder. The following example shows where to
insert the implied test operator. Note that if a test operator was
not supplied in the standard Magic description, the implied operator
is '='.
Magic: 0 string \037\235 compress'd data
XMagic: 0 string = \037\235 compress'd data
This example shows where to insert the placeholder when the test
value is to be ignored:
Magic: >6 byte x type %c
XMagic: >6 byte x - type %c
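The two conversions above can be sketched in Python. This is only an illustration, not a converter shipped with FTimes: the function name split_test is hypothetical, and the operator set is an assumption taken from the single-character operators described in magic(5). A string value that happens to begin with one of those characters (such as '!<arch>') is exactly the ambiguous case mentioned earlier and still needs a human decision.

```python
# Single-character operators assumed from magic(5); XMagic may accept a
# different set.
OPERATORS = "=<>&^!"


def split_test(test):
    """Split a Magic test field into an (operator, value) pair."""
    if test == "x":
        # No test to perform: a hyphen fills the value field.
        return ("x", "-")
    if len(test) > 1 and test[0] in OPERATORS:
        # An explicit operator was fused to the value.
        return (test[0], test[1:])
    # No operator supplied, so '=' is implied.
    return ("=", test)
```

For the examples above, split_test(r"\037\235") yields the pair ('=', r'\037\235') and split_test("x") yields ('x', '-').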
The next two examples show how to convert a series of string/[Bbc]
tests to equivalent regexp tests:
Magic: 0 string/B = \=pod\n Perl POD document
Magic: 0 string/B = \n\=pod\n Perl POD document
Magic: 0 string/B = \=head1\ Perl POD document
Magic: 0 string/B = \n\=head1\ Perl POD document
Magic: 0 string/B = \=head2\ Perl POD document
Magic: 0 string/B = \n\=head2\ Perl POD document
XMagic: 0 regexp =~ ^\n?=(?:pod\n|head[12]) Perl POD document
Magic: 0 string/cB = \<DOCTYPE\ html HTML document text
Magic: 0 string/cb = \<head HTML document text
Magic: 0 string/cb = \<title HTML document text
Magic: 0 string/cb = \<html HTML document text
XMagic: 0 regexp =~ (?i)^\s*<(?:DOCTYPE[\x20\t]+html|head|html|title) HTML document text
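One way to sanity-check a converted expression is to run it against sample inputs for each of the string tests it replaces. The sketch below uses Python's re module rather than PCRE, which is an assumption that holds for these particular patterns but not for every PCRE feature. The HTML pattern is written with its alternatives in a balanced non-capturing group after the '<'. The helper name matches is hypothetical.

```python
import re

# POD: optional leading newline, then "=pod\n", "=head1", or "=head2".
POD = re.compile(r"^\n?=(?:pod\n|head[12])")

# HTML: case-insensitive, optional leading whitespace, then one of the tags.
HTML = re.compile(r"(?i)^\s*<(?:DOCTYPE[\x20\t]+html|head|html|title)")


def matches(pattern, text):
    """Return True if the compiled pattern matches at the start of text."""
    return pattern.search(text) is not None


# Sample inputs, one per original string test.
pod_samples = ["=pod\n", "\n=pod\n", "=head1 NAME", "\n=head2 Usage"]
html_samples = ["<DOCTYPE html", "  <head>", "<TITLE>x</TITLE>", "\n<html>"]
```

Every entry in pod_samples should match POD and every entry in html_samples should match HTML; a string such as "=head3" or plain prose should match neither.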
This example shows how to convert a search/<number> test to an
equivalent regexp:<number> test (note that the current maximum
<number> for XMagic is 128):
Magic: 0 search/20 = foo The venerable %s document
XMagic: 0 regexp:20 =~ foo The venerable %s document
This example shows how to use the block-based test types to harvest
various topographical information:
XMagic: 0 byte x - 512
XMagic: >&0 row_entropy_1:512 x - \b|%f
XMagic: >&0 row_entropy_2:512 x - \b|%f
XMagic: >&0 row_average_1:512 x - \b|%f
XMagic: >&0 row_average_2:512 x - \b|%f
XMagic: >&0 percent_ctype_alnum:512 x - \b|%f
XMagic: >&0 percent_ctype_alpha:512 x - \b|%f
XMagic: >&0 percent_ctype_ascii:512 x - \b|%f
XMagic: >&0 percent_ctype_cntrl:512 x - \b|%f
XMagic: >&0 percent_ctype_digit:512 x - \b|%f
XMagic: >&0 percent_ctype_lower:512 x - \b|%f
XMagic: >&0 percent_ctype_print:512 x - \b|%f
XMagic: >&0 percent_ctype_punct:512 x - \b|%f
XMagic: >&0 percent_ctype_space:512 x - \b|%f
XMagic: >&0 percent_ctype_upper:512 x - \b|%f
XMagic: >&0 sha1:512 x - \b|%s
XMagic: >&0 md5:512 x - \b|%s
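To give a feel for what such a listing reports, here is a rough Python sketch of the same kinds of measurements over the first 512 bytes of a buffer. It is not the FTimes implementation. It assumes Shannon entropy over single bytes, a plain mean of byte values, and percentages on a 0 to 100 scale; what the _1 and _2 suffixes distinguish is not stated here, so the sketch does not model them. All function names are hypothetical, and only four ctype classes are shown.

```python
import hashlib
import math
from collections import Counter

# ASCII-only predicates standing in for a few ctype(3) classes.
CTYPE = {
    "digit": lambda b: 0x30 <= b <= 0x39,
    "upper": lambda b: 0x41 <= b <= 0x5A,
    "lower": lambda b: 0x61 <= b <= 0x7A,
    "space": lambda b: b in b" \t\n\v\f\r",
}


def block_entropy(block):
    """Shannon entropy in bits per byte, from 0.0 to 8.0."""
    total = len(block)
    if total == 0:
        return 0.0
    counts = Counter(block).values()
    return sum((n / total) * math.log2(total / n) for n in counts)


def block_average(block):
    """Mean byte value, from 0.0 to 255.0."""
    return sum(block) / len(block) if block else 0.0


def block_percent(block, cls):
    """Percentage of bytes that fall in the named character class."""
    if not block:
        return 0.0
    hits = sum(1 for b in block if CTYPE[cls](b))
    return 100.0 * hits / len(block)


def describe(data, size=512):
    """Collect the measurements for the first size bytes of data."""
    block = data[:size]
    return {
        "entropy": block_entropy(block),
        "average": block_average(block),
        "percent_digit": block_percent(block, "digit"),
        "md5": hashlib.md5(block).hexdigest(),
        "sha1": hashlib.sha1(block).hexdigest(),
    }
```

A block of identical bytes has entropy 0.0, while a block in which all 256 byte values occur equally often has entropy 8.0, which is why high values tend to point at compressed or encrypted data.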
XMagic does not support signed comparisons -- all integer comparisons
are unsigned. As such, the parser does not recognize the 'u' prefix
on the data type field. An example of this prefix can be found in the
tcpdump Magic.
Magic: 0 ubelong 0xa1b2c3d4 tcpdump capture file (big-endian)
XMagic: 0 belong = 0xa1b2c3d4 tcpdump capture file (big-endian)
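The tcpdump value shows why the signedness matters: 0xa1b2c3d4 has its top bit set, so a signed 32-bit read would yield a negative number. The Python sketch below illustrates the unsigned reading and the type-name conversion. It is not taken from FTimes; the names read_belong and strip_unsigned_prefix are hypothetical, and the set of base type names is an assumed subset of those in magic(5).

```python
import struct

# Assumed subset of integer type names from magic(5).
INTEGER_TYPES = {"byte", "short", "long", "beshort", "belong", "leshort", "lelong"}


def read_belong(data, offset=0):
    """Read a big-endian 32-bit value as an unsigned integer."""
    return struct.unpack_from(">I", data, offset)[0]


def strip_unsigned_prefix(type_name):
    """Drop a leading 'u' from an integer type name, e.g. ubelong to belong."""
    if type_name.startswith("u") and type_name[1:] in INTEGER_TYPES:
        return type_name[1:]
    return type_name


header = bytes.fromhex("a1b2c3d4")
unsigned_value = read_belong(header)              # compares equal to 0xa1b2c3d4
signed_value = struct.unpack(">i", header)[0]     # negative when read as signed
```

Because every comparison is unsigned, the converted belong test and the original ubelong test accept the same four bytes.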
How do I use XMagic?
XMagic is supported in both map and dig modes of operation. However,
the usage is slightly different.
- To use XMagic in mapauto mode, place the xmagic file in the
current working directory or /usr/local/ftimes/etc/xmagic
(c:\ftimes\etc\xmagic for MS Windows).
- To use XMagic in map{lean,full} modes, you can specify an
alternate location using the MagicFile control. To change the
predefined XMagic location, edit the value for XMAGIC_DEFAULT_LOCATION
in xmagic.h, and recompile.
- To use XMagic in dig{auto,lean,full} modes, you must assign the
path of the XMagic file to the DigStringXMagic control. Note that
XMagic is not strictly limited to block typing in dig mode. It can
also be used to harvest various topographical information and
enumerate well-known structures.
Why does XMagic sometimes report different results than file(1)?
Except for special files, FTimes does not support any built-in Magic.
Because of this, FTimes will often report 'unknown' where file(1)
reports something like 'ASCII English text'. The main reason for this
is that built-in Magic is not based on Magic tests. Rather, it's
based on logic that scans the input buffer (character by character)
and attempts to make an educated guess as to what the underlying data
"looks" like. Currently, whether to support built-in Magic or not is
an open question.
What is the future of XMagic?
Currently, XMagic is in a transitional state. We're looking for a
better way to standardize existing tests. We're also looking for a
way to abbreviate Magic descriptions. Two ideas that have come up are
a unique numbering scheme and a MIB-like structure. In either case,
we think that there should be a unique mapping between a particular
Magic test and its identifier.
Along those same lines, it would be nice if the user could specify the
level of detail that will be provided upon a match. For example,
sometimes it would be sufficient to know that a file (e.g.,
aliases.db) is a "Berkeley DB Hash file" instead of "Berkeley DB Hash
file (Version 2, Little Endian, Bucket Size 8192, Bucket Shift 13,
Directory Size 256, Segment Size 256, Segment Shift 8, Overflow Point
1, Last Freed 2, Max Bucket 1, High Mask 0x3, Low Mask 0x1, Fill
Factor 65536, Number of Keys 20)".
Another planned change is to move to URL encoding for string values.
The current format allows for escapes (e.g., '\ ' for a space) and
octal character representations. This makes parsing strings more
complex than it needs to be, and has led to broken Magic incantations.