Identifying files in the terminal UNIX
Ever visited one of those sites that serves a download through JavaScript or some other active method, and wound up with a file named "download.asp" or "file.asp"? Ever wondered how to figure out what it is without dragging and dropping it onto everything?

In the terminal, it's quite simple - you can use the file command, like this:
[xperiment:~/Documents/downloads] berto% file download.php
download.php: gzip compressed data, deflated,
last modified: Fri Feb 23 18:17:34 2001, os: Unix
(Line break added to shorten the line width!) The file command examines the file's contents, compares them to a database of known types, and then gives you its best guess at the file type.

In this real-world example, I couldn't figure out how to expand the file; the output of file told me I needed to use gzip! For full information on file, check the manual page by typing man file in the terminal window.
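As a follow-up, here is a minimal sketch of the next step once file reports gzip data. The file is a stand-in built on the spot (the name download.php, the temporary directory and the sample text are assumptions for illustration), not the original download. Because gzip -d generally refuses a file without a suffix it recognizes, the sketch feeds the data through standard input instead of renaming the file.

```shell
# Build a throwaway gzip file with a misleading name (stand-in for the real download).
workdir=$(mktemp -d)
printf 'hello from the archive\n' | gzip -c > "$workdir/download.php"

# Ask file what the data really is, regardless of the name.
file "$workdir/download.php"

# Decompress via stdin so the missing .gz suffix does not matter,
# and write the result under a name of our choosing.
gzip -dc < "$workdir/download.php" > "$workdir/download.out"
cat "$workdir/download.out"
```

If the decompressed result is itself of unknown type, running file on it again is a reasonable next move.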

Handy find | xargs Construct
Authored by: MartySells on Sep 26, '04 04:18:36PM
The UNIX file command uses a "magic" set of data (/usr/share/file/magic) to identify file types. This is much smarter than relying on file extensions or Mac file types to identify file content. For instance, a JPEG may be named .JPG, .jpg, .jpeg or .JPEG, and file identifies it from its contents either way.

Some examples. The following command looks for MP3 files:
    find . -type f -print0 | xargs -0 file | grep MP3
The first argument to find, . (dot), is the starting directory. A similar command finds JPEGs:
    find . -type f -print0 | xargs -0 file | grep JPEG
The find FIND_EXPRESSIONS -print0 | xargs -0 COMMAND construct is handy for much more than file. Some examples:
    find fed/ -name '*.xls' -print0 | xargs -0 zip ./fedfiles.zip
will find all *.xls files below the fed/ directory and zip them into fedfiles.zip.
    find fed/ -type f -print0 | xargs -0 du -sk | sort -n | tail -5
will show the five largest files below fed/.

The find -print0 | xargs -0 approach has two key features: the command you specify gets multiple filenames per invocation, and it's "safe" with filenames containing special characters like spaces and quotes.
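A small sketch can make the "multiple filenames per invocation" point visible. The directory and filenames below are made up for the demonstration; the inner sh -c simply prints how many arguments each invocation received.

```shell
# Throwaway directory with three files, two of them with awkward names.
batch=$(mktemp -d)
touch "$batch/one.xls" "$batch/two words.xls" "$batch/it's quoted.xls"

# Default behaviour: xargs hands all names to a single invocation.
# Each run of the inner sh prints the number of filenames it was given.
per_call=$(find "$batch" -type f -print0 | xargs -0 sh -c 'echo "$#"' sh)
echo "arguments per invocation: $per_call"

# With -n 1, xargs runs the command once per filename instead.
one_each=$(find "$batch" -type f -print0 | xargs -0 -n 1 sh -c 'echo "$#"' sh | wc -l | tr -d ' ')
echo "invocations with -n 1: $one_each"
```

With only three files, the first pipeline should run the command once with all three names, while the -n 1 variant runs it three times. For very long file lists xargs splits the work into several batches on its own.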

Using a small example we can show that calling zip once with all filenames is faster than invoking it once per filename:
$ time for i in fed/* ; do zip -q test1.zip "$i" ; done
real    0m0.232s
user    0m0.130s
sys     0m0.090s
$ time find fed/ -type f -print0 | xargs -0 zip -q test2.zip
real    0m0.160s
user    0m0.100s
sys     0m0.030s
And here is an example that does not use this technique, and so fails on filenames with spaces:
$ md5 -r `find fed -type f -print`
e8a731935dd19a18d7c2583ee14cd2b8 fed/269block.xls
030b4bf1ddd17de9131d54b5ddd52b7d fed/288BLOCK.XLS
5d57958ecb970e09b56006b0219bc9e1 fed/358BLK1.xls
d14d3c649673ee1db249491da5ce6f0b fed/681block.xls
md5: fed/743: No such file or directory
md5: block: No such file or directory
md5: South: No such file or directory
md5: Point.xls: No such file or directory
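For comparison, here is a sketch of the same job done with the -print0 | xargs -0 construct. It uses a throwaway directory with invented filenames, and it swaps in cksum for md5 -r (an assumption made for portability, since md5 is not present on every system); the construct itself is unchanged.

```shell
# Recreate the problem on a small scale: one plain name, one name with spaces.
demo=$(mktemp -d)
mkdir "$demo/fed"
printf 'a\n' > "$demo/fed/269block.xls"
printf 'b\n' > "$demo/fed/743 block South Point.xls"

# Unsafe: the shell splits find's output on whitespace, so the second
# filename falls apart into pieces that do not exist.
unsafe_count=$(cksum $(find "$demo/fed" -type f -print) 2>/dev/null | wc -l | tr -d ' ')

# Safe: names are separated by NUL bytes and passed to cksum intact.
safe_count=$(find "$demo/fed" -type f -print0 | xargs -0 cksum | wc -l | tr -d ' ')

echo "unsafe: $unsafe_count file(s) checksummed, safe: $safe_count file(s) checksummed"
```

The unsafe variant only manages to checksum the file without spaces in its name, while the safe variant handles both.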
Many Unix commands accept a null-terminated list of filenames, like the one produced by find -print0, as input. While the syntax of xargs requires a bit of learning (RTFM), it is a very powerful tool!
-m
