lod3n Posted August 3, 2007

I know you can do this with the COM object Shell.Application, but that's not the point of this demo.

I was curious as to the specifics of where NTFS metadata is actually stored. This is the information you see when you right-click on a file, go to Properties, and then Summary, such as Keywords, Title, Categories, etc.

It seems that Microsoft is not storing the information in the file itself, but rather in a binary file attached to the file via an Alternate Data Stream (ADS). The following is a good primer on ADS, and it even gives a quick overview of what this script is actually doing:
http://members.cox.net/slatteryt/Streams.html

The following link is a technical analysis of the binary files that Windows uses to store the metadata:
http://sedna-soft.de/summary-information-stream/

I have never dissected and reassembled a binary file like this before, so I am sure my methodology is less than ideal. If you understand what I am doing here and have some suggestions, I am all ears. The file starts with some headers that point to sections, and the sections point to data blocks, each of which seems to be null terminated. I can't explain it any further than that.

The end goal I have in mind for this is to accurately disassemble the 2 metadata files, and then rebuild them, and reapply them to the target file. Eventually this will evolve into a _FileSetMetadata UDF, but I have a long way to go.
Microsoft provides no documentation for this, which is kind of cool, in that I seem to be doing something unique.

You must run this in SciTE, as it outputs to the console.

#include <string.au3>

$message = "Select a file that has Metadata properties set"
$filename = FileOpenDialog($message, @DesktopDir & "\", "All (*.*)", 1)
If @error Then
    MsgBox(4096, "", "No File(s) chosen")
    Exit
EndIf

; ADS file containing most extended properties
$ads1 = $filename & ":" & Chr(5) & "SummaryInformation"
; ADS file containing the category information
$ads2 = $filename & ":" & Chr(5) & "DocumentsummaryInformation"

If Not FileExists($ads1) Then
    MsgBox(16, "Error", "The file: " & @CRLF & " " & $filename & @CRLF & _
            "does not have any metadata properties set. " & @CRLF & @CRLF & _
            "Please choose a different file")
    Exit
EndIf

; Expected variant type per property ID
; (2 = VT_I2, 3 = VT_I4, 30 = VT_LPSTR, 64 = VT_FILETIME, 71 = VT_CF)
$propType_1 = 2
$propType_2 = 30
$propType_3 = 30
$propType_4 = 30
$propType_5 = 30
$propType_6 = 30
$propType_7 = 30
$propType_8 = 30
$propType_9 = 30
$propType_10 = 64
$propType_11 = 64
$propType_12 = 64
$propType_13 = 64
$propType_14 = 3
$propType_15 = 3
$propType_16 = 3
$propType_17 = 71
$propType_18 = 30
$propType_19 = 3

; Description per property ID
$propDesc_1 = "Code page"
$propDesc_2 = "Title"
$propDesc_3 = "Subject"
$propDesc_4 = "Author"
$propDesc_5 = "Keywords"
$propDesc_6 = "Comments"
$propDesc_7 = "Template"
$propDesc_8 = "Last Saved By"
$propDesc_9 = "Revision Number"
$propDesc_10 = "Total Editing Time"
$propDesc_11 = "Last Printed"
$propDesc_12 = "Create Time/Date"
$propDesc_13 = "Last Saved Time/Date"
$propDesc_14 = "Number of Pages"
$propDesc_15 = "Number of Words"
$propDesc_16 = "Number of Characters"
$propDesc_17 = "Thumbnail"
$propDesc_18 = "Name of Creating Application"
$propDesc_19 = "Security"

; FileOpen flags used below:
;16 = Force binary(byte) reading and writing mode with FileRead and FileWrite
;32 = Use Unicode UTF16 Little Endian mode when writing text with FileWrite and FileWriteLine (default is ANSI)

; field sizes in bytes
$word = 2
$dword = 4
$byte = 1
$guid = $dword + ($word * 2) + ($byte * 2) + ($byte * 6)

$file = FileOpen($ads1, 16+32)
If $file = -1 Then
    MsgBox(0, "Error", "Unable to open file.")
    Exit
EndIf
; keep the whole stream as one hex string, two characters per byte
Global $data = Hex(FileRead($file))
FileClose($file)

; read header
$byteOrderMarkForUTF16LE = Chomp(0x00, $word)
ConsoleWrite("$byteOrderMarkForUTF16LE: " & $byteOrderMarkForUTF16LE & @CRLF)
$streamValidation = Chomp(0x02, $word)
ConsoleWrite("$streamValidation: " & $streamValidation & @CRLF)
$unknownPurpose = Chomp(0x04, $word)
ConsoleWrite("$unknownPurpose: " & $unknownPurpose & @CRLF)
$OSIndicator = Dec(LEDec(Chomp(0x06, $word)))
ConsoleWrite("$OSIndicator: " & $OSIndicator & @CRLF)
$streamClassID = Chomp(0x08, $guid)
ConsoleWrite("$streamClassID: " & $streamClassID & @CRLF)
$sectionCount = Dec(LEDec(Chomp(0x18, $dword)))
ConsoleWrite("$sectionCount: " & $sectionCount & @CRLF)

; read section declarations
$sect1ClassID = LEDec(Chomp(0x1c, $guid)) ; not sure this should be LE decoded
ConsoleWrite("$sect1ClassID: " & $sect1ClassID & @CRLF)
$sect1Offset = Dec(LEDec(Chomp(0x2c, $dword)))
ConsoleWrite("$sect1Offset: " & $sect1Offset & @CRLF)

; read first section header
$sect1Length = Dec(LEDec(Chomp($sect1Offset + 0x00, $dword)))
ConsoleWrite("$sect1Length: " & $sect1Length & @CRLF)
$sect1PropCount = Dec(LEDec(Chomp($sect1Offset + 0x04, $dword)))
ConsoleWrite("$sect1PropCount: " & $sect1PropCount & @CRLF)

; read Property declarations
$cursor = 0x04
For $i = 1 To $sect1PropCount
    ConsoleWrite("------------------" & @CRLF)

    ; read property ID and Offset from Section Header
    $cursor += $dword
    $PropId = Dec(LEDec(Chomp($sect1Offset + $cursor, $dword)))
    $cursor += $dword
    $PropOffset = Dec(LEDec(Chomp($sect1Offset + $cursor, $dword)))
    $realPropOffset = $sect1Offset + $PropOffset

    $propType = PropGetType($PropId)
    $propDesc = PropGetDesc($PropId)
    Switch $propType
        Case 2
            ConsoleWrite("2 byte signed integer" & @CRLF)
        Case 3
            ConsoleWrite("4 byte signed integer" & @CRLF)
        Case 30
            ConsoleWrite("null-terminated string prepended by dword string length" & @CRLF)
        Case 64
            ConsoleWrite("Filetime (64-bit value representing the number of 100-nanosecond intervals since January 1, 1601)" & @CRLF)
        Case 71
            ConsoleWrite("Clipboard format" & @CRLF)
        Case Else
            ConsoleWrite("Unknown Type" & @CRLF)
    EndSwitch

    ConsoleWrite($i & " $PropId: " & $PropId & @CRLF)
    ConsoleWrite($i & " $realPropOffset: " & $realPropOffset & @CRLF)
    ConsoleWrite($i & " $propType: " & $propType & @CRLF)

    $ptype = Dec(LEDec(Chomp($realPropOffset, $dword))) ; always 31? WTF?
    ConsoleWrite("$ptype = " & $ptype & @CRLF)
    $plen = Dec(LEDec(Chomp($realPropOffset + $dword, $dword)))
    ConsoleWrite("$plen = " & $plen & @CRLF)

    ; walk the value as 2-byte characters until the null terminator
    $pdataLoc = $realPropOffset + $dword + $dword
    $pcursor = 0
    $propStringValue = ""
    While 1
        $char = Dec(LEDec(Chomp($pdataLoc + $pcursor, $word)))
        If $char = 0 Then ExitLoop
        ; ChrW instead of Chr, so that characters above 255 survive
        $propStringValue &= ChrW($char)
        $pcursor += $word
    WEnd
    ConsoleWrite("! " & $propDesc & ": " & $propStringValue & @CRLF)
Next

;read a number of hex characters from a given offset point
Func Chomp($offset, $length)
    ;ConsoleWrite($offset & ": ")
    Local $charPosition = ($offset * 2) + 1
    Local $charLen = $length * 2
    Return StringMid($data, $charPosition, $charLen)
EndFunc

; little endian decoder: reverses the byte order of a hex string
Func LEDec($hexstring)
    Local $output = ""
    For $i = 1 To StringLen($hexstring) Step 2
        $output = StringMid($hexstring, $i, 2) & $output
    Next
    Return $output
EndFunc

Func PropGetType($propID)
    If IsDeclared("propType_" & $propID) Then
        Return Eval("propType_" & $propID)
    Else
        Return ""
    EndIf
EndFunc

Func PropGetDesc($propID)
    If IsDeclared("propDesc_" & $propID) Then
        Return Eval("propDesc_" & $propID)
    Else
        Return ""
    EndIf
EndFunc

Exit

; Debugging hex dump of the whole stream. It never runs because of the Exit above.
ConsoleWrite("---------------------" & @CRLF)
$count = 0
While StringLen($data) > 0
    $char = StringLeft($data, 2)
    $data = StringTrimLeft($data, 2)
    $ascii = StringStripCR(_HexToString($char))
    ConsoleWrite($count & @TAB & $char & @TAB & $ascii & @CRLF)
    $count += 1
WEnd
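One possible alternative to slicing a hex string, for anyone who wants to experiment: copy the raw bytes into a DllStruct and let AutoIt read the little-endian fields directly. This is only a sketch. The field names are made-up labels for the offsets the script above reads (0x00 to 0x2C), and the sample bytes are synthetic so that it runs without a real stream; the section format ID in the sample is the one commonly documented for SummaryInformation.

```autoit
; Sketch only: field names are made-up labels for the offsets read by the script above.
Local Const $sLayout = "ushort ByteOrder;ushort Validation;ushort Unknown;ushort OSIndicator;" & _
        "byte ClassID[16];dword SectionCount;byte Sect1FmtID[16];dword Sect1Offset"

; Synthetic 48-byte header, in stream order:
; byte order, validation, unknown, OS indicator, class ID,
; section count (1), section 1 format ID, section 1 offset (0x30)
Local $bHeader = Binary("0x" & _
        "FEFF" & "0000" & "0501" & "0200" & _
        "00000000000000000000000000000000" & _
        "01000000" & _
        "E0859FF2F94F6810AB9108002B27B3D9" & _
        "30000000")

; Copy the bytes into a plain buffer, then overlay the named layout on the same memory
Local $tBuf = DllStructCreate("byte[" & BinaryLen($bHeader) & "]")
DllStructSetData($tBuf, 1, $bHeader)
Local $tHdr = DllStructCreate($sLayout, DllStructGetPtr($tBuf))

ConsoleWrite("Byte order:     0x" & Hex(DllStructGetData($tHdr, "ByteOrder"), 4) & @CRLF)
ConsoleWrite("Section count:  " & DllStructGetData($tHdr, "SectionCount") & @CRLF)
ConsoleWrite("Section offset: " & DllStructGetData($tHdr, "Sect1Offset") & @CRLF)
```

To try it on a real stream, $bHeader would be replaced with the first 48 bytes that FileRead returns in binary mode. The trade-off is that a fixed layout only covers the fixed part of the header; the property table still has to be walked by offset.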
ptrex Posted August 3, 2007

@lod3n

Very interesting concept. I tried several files, but none returned any metadata. I always get a message saying that the file has no metadata.

Regards,
ptrex
Fabry Posted August 3, 2007

Me too. How can I add metadata to a file?
lod3n Posted August 3, 2007 (edited)

Right-click on the file and select Properties. Then click on the Summary tab. Add text to the Title, Subject, Author, Category, Keywords and Comments fields and click OK. Not all files provide a Summary tab, for some reason. Also, you need to be running NT or later, and your hard drive must be NTFS formatted.

Edited August 3, 2007 by lod3n
flyingboz Posted September 7, 2007

@lod3n, have you thought about extending the concept to setting these properties, in addition to reading them?
lod3n Posted September 10, 2007

Yes, I have thought about that: "The end goal I have in mind for this is to accurately disassemble the 2 metadata files, and then rebuild them, and reapply them to the target file. Eventually this will evolve into a _FileSetMetadata UDF, but I have a long way to go."

My attempts so far have not generated good results, but once they do, I will post something. The problem is I have no motivation to work on this right now, as I have no project that would benefit from it.
smashly Posted September 10, 2007

Nice idea, using ADS for extended properties on files with an AutoIt-based UDF.

I used this page to get an idea about using ADS:
http://www.irongeek.com/i.php?page=security/altds

I found the link to that page in this thread:
http://www.autoitscript.com/forum/index.ph...5222&hl=ADS

The only thing that sort of turned me off using ADS more was the idea of losing the ADS data if I copied my file to a FAT32 drive (e.g. a memory stick).

Cheers
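The FAT32 concern can be checked up front. A minimal sketch, assuming a drive-letter path: _DriveKeepsStreams is a hypothetical helper name, the destination path is made up, and treating "NTFS" as the only file system that keeps the streams is a simplification.

```autoit
; Sketch only: _DriveKeepsStreams is a hypothetical helper name.
; Returns True when the drive holding $sPath reports NTFS as its file system.
Func _DriveKeepsStreams($sPath)
    ; Assumes a drive-letter path such as "E:\folder\file.flv"; UNC paths are not handled
    Local $sDrive = StringLeft($sPath, 3)
    Return DriveGetFileSystem($sDrive) = "NTFS"
EndFunc

Local $sTarget = "E:\backup\file.flv" ; hypothetical copy destination
If Not _DriveKeepsStreams($sTarget) Then
    ConsoleWrite("! " & StringLeft($sTarget, 2) & " is not NTFS, so a copy there would drop the metadata streams" & @CRLF)
EndIf
```

A script could run a check like this before FileCopy and either warn the user or extract the streams to ordinary files first.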
Uten Posted September 12, 2007

This article at Desaware should give insight, and keywords to search for, to anyone interested in MS alternate data streams. Dan Appleman and Desaware were quite early at exploring and creating support for this "technology", but the 1.0 component was quite buggy. I burned my fingers a bit on that one.
lod3n Posted September 12, 2007

I have the ADS portion of this all sorted out; that's not the difficulty at all. What I struggle with is manipulating the actual hex in the binary metadata files themselves. It's not like they're INIs or something, and the 3rd party documentation that I linked to above is not 100% accurate in describing the contents of the two files.

If anyone is good at decoding proprietary undocumented binary file formats from Microsoft, please take a look at how this works, and let me know if you have any suggestions or improvements.

If the fact that the binary files are stored in ADS is problematic for anyone, here is a small function to extract them so you can hack on them in your favorite hex editor:

$filename = @ScriptDir & "\file.flv"

$AdsSrc = $filename & ":" & Chr(5) & "SummaryInformation"
$BinTarget = $filename & "_SummaryInformation.bin"
_ExtractAdsFile($AdsSrc, $BinTarget)

$AdsSrc = $filename & ":" & Chr(5) & "DocumentsummaryInformation"
$BinTarget = $filename & "_DocumentsummaryInformation.bin"
_ExtractAdsFile($AdsSrc, $BinTarget)

; Copies the raw bytes of an alternate data stream into an ordinary file.
; Returns True on success, False if either file could not be opened.
Func _ExtractAdsFile($src, $target)
    Local $fhSrc = FileOpen($src, 16+32)
    If $fhSrc = -1 Then
        ConsoleWrite("! Could not open " & $src & " for reading" & @CRLF)
        Return False
    EndIf
    Local $data = FileRead($fhSrc)
    FileClose($fhSrc)

    ; 2 = erase previous contents, 8 = create the directory structure if needed
    Local $fhTarg = FileOpen($target, 16+32+8+2)
    If $fhTarg = -1 Then
        ConsoleWrite("! Could not open " & $target & " for writing" & @CRLF)
        Return False
    EndIf
    FileWrite($fhTarg, $data)
    FileClose($fhTarg)
    Return True
EndFunc

Attached are the extracted metadata files that this function produced, and a screenshot of the data that they contain.

Attachment: metadata.zip
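On the `$ptype` value of 31 that the first script reports: in the standard variant type numbering, 30 is VT_LPSTR (a code-page string) and 31 is VT_LPWSTR (a UTF-16 string), which would explain why reading the value two bytes at a time works. Below is a minimal sketch of decoding a string property from raw bytes by dispatching on the stored type instead of on a lookup table. _LEUInt and _ReadPropString are hypothetical helper names, the sample bytes are synthetic, and the VT_LPSTR branch simply assumes an ANSI code page.

```autoit
; Sketch only: _LEUInt and _ReadPropString are hypothetical helper names.

; Read an unsigned little-endian integer of $iSize bytes at a zero-based offset.
Func _LEUInt($bData, $iOffset, $iSize)
    Local $iValue = 0
    For $i = $iSize To 1 Step -1
        ; BinaryMid is one-based, so byte number $i of the field sits at $iOffset + $i
        $iValue = $iValue * 256 + Dec(Hex(BinaryMid($bData, $iOffset + $i, 1)))
    Next
    Return $iValue
EndFunc

; Decode a string property whose type dword sits at the zero-based $iOffset.
; Sets @error to 1 (and @extended to the type) when the value is not a string.
Func _ReadPropString($bData, $iOffset)
    Local $iType = _LEUInt($bData, $iOffset, 4)
    Local $iLen = _LEUInt($bData, $iOffset + 4, 4)
    Local $iBytes
    Switch $iType
        Case 31 ; VT_LPWSTR: length counts UTF-16 characters, terminator included
            $iBytes = ($iLen - 1) * 2
            If $iBytes < 1 Then Return ""
            Return BinaryToString(BinaryMid($bData, $iOffset + 9, $iBytes), 2)
        Case 30 ; VT_LPSTR: length counts bytes, terminator included (ANSI assumed here)
            $iBytes = $iLen - 1
            If $iBytes < 1 Then Return ""
            Return BinaryToString(BinaryMid($bData, $iOffset + 9, $iBytes), 1)
    EndSwitch
    Return SetError(1, $iType, "")
EndFunc

; Synthetic value: type 31, length 4, then "Foo" plus terminator in UTF-16 LE
Local $bSample = Binary("0x1F00000004000000" & "46006F006F000000")
ConsoleWrite(_ReadPropString($bSample, 0) & @CRLF) ; prints Foo
```

For one of the extracted .bin files, $bData would be the result of FileRead in binary mode and $iOffset the section offset plus the property offset. Using the stored length rather than scanning for a terminator also gives the size of the block, which a rebuild step would need.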