wvzuilen Posted August 31, 2012

Hi all,

I've got a huge XML file that I want to edit automatically. It has over 10 million rows. First I used a For...Next loop with FileReadLine, and of course that took ages. Then I tried reading the file into an array, editing the array by replacing certain values, and writing the array back to an XML file. That worked, and it was much faster than editing the XML file itself.

```autoit
#include <File.au3>
#include <Array.au3>

$filename = "OUTPUT_OPDRACHTEN_20120411_221538.xml"
Local $aArray

ConsoleWrite("Reading file..." & @LF)
_FileReadToArray($filename, $aArray) ; $aArray[0] holds the number of lines read
ConsoleWrite("Reading file... Ready!" & @LF)

$lines = UBound($aArray)
$count = 0

ConsoleWrite("Changing..." & @LF)
For $i = 0 To $lines - 1
    ; Replace the whole line when it contains the opening tag
    If StringInStr($aArray[$i], "<BC_INCASSO>") > 0 Then
        $aArray[$i] = "<BC_INCASSO xmlns:xsi=""http://www.w3.org/2001/XMLSchema-instance"" xsi:nil=""true""/>"
        $count = $count + 1
        ;ConsoleWrite($i & @LF)
    EndIf
    If StringInStr($aArray[$i], "<OS_OMA_ONDW_OMA>Wijziging</OS_OMA_ONDW_OMA>") > 0 Then
        $aArray[$i] = "<OS_OMA_ONDW_OMA>Verlenging</OS_OMA_ONDW_OMA>"
    EndIf
Next
ConsoleWrite("Changing... Ready!" & @LF)

ConsoleWrite("Writing file" & @LF)
$new = FileOpen("new.xml", 129) ; 129 = 1 (append) + 128 (UTF-8 with BOM)
For $i = 1 To $lines - 1 ; start at 1 to skip the line count in element 0
    FileWriteLine($new, $aArray[$i])
Next
FileClose($new)
ConsoleWrite("Writing file... Ready!" & @LF)
ConsoleWrite($count & @LF)
```

But now for the next step. I would like to insert a few rows in certain places of the array, but that takes ages again. I guess that's because when I add a new value with _ArrayInsert at, say, index 5 of an array with 10,000,000 values, it has to re-index all values below the new one.

```autoit
If StringInStr($aArray[$i], "") > 0 Then
    $aArray[$i] = "ID1"
    _ArrayInsert($aArray, $i + 2, "")
    _ArrayInsert($aArray, $i + 3, "ID2")
    _ArrayInsert($aArray, $i + 4, "")
    ConsoleWrite($i & @LF)
EndIf
```

Does anybody have an idea how I can do this reasonably fast?

Greetings.
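The re-indexing cost described above can be avoided without an array at all. The following is a minimal sketch, not a tested solution for this file: it copies the input line by line through file handles and writes the extra rows at the moment the matching line is written, so nothing ever has to be shifted. The marker tag and the inserted row are hypothetical placeholders, and the encoding flags are an assumption. Calling FileReadLine on an open handle without a line number reads sequentially, which avoids the rescanning that makes a loop over line numbers so slow.

```autoit
; Sketch only: $sMarker and the inserted row are made-up placeholders.
Local $sInPath = "OUTPUT_OPDRACHTEN_20120411_221538.xml"
Local $sOutPath = "new.xml"
Local $sMarker = "<MARKER>" ; hypothetical tag that triggers an insert

Local $hIn = FileOpen($sInPath, 0)        ; 0 = read mode
Local $hOut = FileOpen($sOutPath, 2 + 128) ; 2 = overwrite, 128 = UTF-8 with BOM (assumed encoding)
If $hIn = -1 Or $hOut = -1 Then
    ConsoleWrite("Could not open the input or output file" & @LF)
    Exit 1
EndIf

Local $sLine, $iInserted = 0
While 1
    $sLine = FileReadLine($hIn) ; no line number: reads the next line from the handle
    If @error Then ExitLoop     ; end of file or read error
    FileWriteLine($hOut, $sLine)
    If StringInStr($sLine, $sMarker) > 0 Then
        ; Write the new rows right after the matching line
        FileWriteLine($hOut, "<EXTRA>ID2</EXTRA>") ; hypothetical inserted row
        $iInserted += 1
    EndIf
WEnd

FileClose($hIn)
FileClose($hOut)
ConsoleWrite("Rows inserted after " & $iInserted & " matching lines" & @LF)
```

The in-place replacements from the first script fit into the same loop: change $sLine before the FileWriteLine call. This is still text processing rather than XML processing, so it only holds up while every element of interest sits on its own line.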
jdelaney Posted August 31, 2012 (edited)

Use the XMLDOM object: http://www.w3schools.com/dom/default.asp

Example of usage:

```autoit
$oXML = ObjCreate("Microsoft.XMLDOM")
$stest = @DesktopDir & "\xml1.xml"
$oXML.load($stest) ; load document
; Assumed intent: select the first element that has a child <b> with the text "Today"
$result = $oXML.selectSingleNode('//*[b="Today"]')
ConsoleWrite($result.xml & @CRLF)
ConsoleWrite($result.childNodes.item(1).text & @CRLF)
ConsoleWrite($result.childNodes.item(0).text & @CRLF)
```

There is also selectNodes, so if you have a generic structure and want to add a node, you can use that one and then loop through the returned collection of objects.

Here is an example of adding an <edition> child node to ALL instances of ANY <book>:

```autoit
$oBooks = $oXML.getElementsByTagName("book")
For $oBook In $oBooks
    ; A node can have only one parent, so create a new element for each book
    $oBook.appendChild($oXML.createElement("edition"))
Next
```

Edited August 31, 2012 by jdelaney
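For reference, the whole DOM round trip (load, check for parse errors, modify, serialize) might look like the sketch below. It is self-contained and uses made-up sample XML loaded from a string; the element names and the "first" text value are illustrative assumptions, not taken from the file in the question. Note that the DOM holds the entire document in memory, which matters for a file with 10 million rows.

```autoit
; Sketch only: the sample XML and element names are invented for illustration.
Local $oXML = ObjCreate("Microsoft.XMLDOM")
If Not IsObj($oXML) Then
    ConsoleWrite("Could not create the XMLDOM object" & @LF)
    Exit 1
EndIf

$oXML.async = False ; finish loading before continuing
$oXML.loadXML("<library><book><title>A</title></book><book><title>B</title></book></library>")
If $oXML.parseError.errorCode <> 0 Then
    ConsoleWrite("Parse error: " & $oXML.parseError.reason & @LF)
    Exit 1
EndIf

; Select every <book> element and give each one its own <edition> child
Local $oBooks = $oXML.selectNodes("//book")
Local $oEdition
For $oBook In $oBooks
    $oEdition = $oXML.createElement("edition")
    $oEdition.text = "first" ; hypothetical value
    $oBook.appendChild($oEdition)
Next

ConsoleWrite($oBooks.length & " nodes updated" & @LF)
ConsoleWrite($oXML.xml & @LF)
; To write the result to disk instead: $oXML.save(@ScriptDir & "\new.xml")
```

For a real file, replace loadXML with load and a path, as in the example above.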
skin27 Posted August 31, 2012

If you have such a large XML document, there is no way to make your script both fast and error-free by editing it like a string or text file. You really need the standards made for XML (XPath/XQuery).

I once needed to transform an XML document with 5 million rows with XSLT. I found XMLDOM too slow and ended up calling Saxon (http://saxon.sourceforge.net/) from the command line. Another option is using an XML database. For BaseX, 10 million rows is nothing (http://basex.org). To combine it with AutoIt you either need to call it from the command line or use the Java UDF.

So I would first try jdelaney's suggestions, and if they don't work for you, then try some of mine.
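Calling Saxon from an AutoIt script could be sketched roughly as follows. The jar file name, the stylesheet edit.xsl, and the output path are all placeholders that depend on the Saxon version installed and on a stylesheet you would have to write yourself; the -s:, -xsl: and -o: options are the ones used by the Saxon 9 command line. Java is assumed to be on the PATH.

```autoit
; Sketch only: jar name, stylesheet and paths are hypothetical placeholders.
Local $sJar = @ScriptDir & "\saxon9he.jar" ; depends on the installed Saxon version
Local $sIn = @ScriptDir & "\OUTPUT_OPDRACHTEN_20120411_221538.xml"
Local $sXsl = @ScriptDir & "\edit.xsl"     ; stylesheet describing the edits
Local $sOut = @ScriptDir & "\new.xml"

; Build: java -jar saxon.jar -s:source -xsl:stylesheet -o:output
Local $sCmd = 'java -jar "' & $sJar & '" -s:"' & $sIn & '" -xsl:"' & $sXsl & '" -o:"' & $sOut & '"'

; RunWait blocks until the process ends and returns its exit code
Local $iExit = RunWait($sCmd, @ScriptDir, @SW_HIDE)
If @error Then
    ConsoleWrite("Could not start java" & @LF)
ElseIf $iExit <> 0 Then
    ConsoleWrite("Saxon exited with code " & $iExit & @LF)
Else
    ConsoleWrite("Transformation written to " & $sOut & @LF)
EndIf
```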
wvzuilen (Author) Posted August 31, 2012

Thanks for your suggestions. I'll try the XML DOM first.
Imbuter2000 Posted June 3, 2013

skin27, can I ask why you ended up choosing Saxon over BaseX as a command-line XML parser?