argumentum

msWord XML beautifier

So I got Notepad++, but its XML pretty-printing plugin was not working for me, and the online beautifiers would not work all the time. I wanted output like Tidy produces, indented with tabs, so that it looks nice in SciTE and I can read the XML and make my own template for a script I'm working on. It was very frustrating and time consuming, so I put this code together. Now I get the XML prettified in about a tenth of a second, instead of minutes of frustration.

Anyway, the way I use it: I drag and drop the Word document (saved in XML format) into SciTE, copy it to the clipboard (Ctrl-A, Ctrl-C), switch to this script, press F5, and I'm happy.

; Only run from the editor (SciTE passes /ErrorStdOut) or as a compiled script
If Not StringInStr($CmdLineRaw, "/ErrorStdOut") And Not @Compiled Then Exit MsgBox( 262144 , @ScriptName, "please run from Editor", 10)

Local $s = ClipGet()
; Bail out unless the clipboard holds Word XML; the clipboard is left untouched in that case
If Not StringInStr($s, '<?mso-application progid="Word.Document"?>') Then Exit MsgBox(262144, StringTrimRight(@ScriptName, 4), "tested only in Word.Document XML" & @CR & @CR & "no changes made to clipboard", 20)
Local $sOut = msWordXML_Beautify($s, 2) ; 2=return as beautified string, 1=ConsoleWrite beautified string, 0=return beautified array
ClipPut($sOut)
MsgBox(262144, StringTrimRight(@ScriptName, 4), "clipboard content replaced by beautified XML", 2)


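; Puts every tag (or tag pair with its text) on its own line and indents the lines with tabs.
; $iEcho: 0 = return an array ([0] = line count), 1 = ConsoleWrite the lines and return the array, 2 = return one CRLF-joined string
Func msWordXML_Beautify($s, $iEcho = 0)
    Local $iTimer = TimerInit()
    ; Drop the existing line breaks so the document is one long string
    $s = StringReplace($s, @CR, '')
    $s = StringReplace($s, @LF, '')
    ; Each element of $a is the text that follows one "<"
    Local $a = StringSplit($s, "<")
    Local $b[$a[0] * 2]
    Local $i = 0, $c = ""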
    For $x = 1 To $a[0]
        If StringReplace($a[$x], @TAB, "") = "" Then ContinueLoop
        If StringInStr($a[$x], ">") Then $a[$x] = StringReplace($a[$x], @TAB, '')
        $c = StringSplit($a[$x], ">")
        If UBound($c) < 2 Then ContinueLoop
        For $y = 1 To $c[0]
            If $y = 1 Then
                $i += 1
                $b[$i] = "<" & $c[$y] & ">"
            Else
                If $c[$y] = "" Then ContinueLoop
                $i += 1
                $b[$i] = $c[$y]
            EndIf
        Next
    Next
    ReDim $b[$i + 1]
    $b[0] = $i
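    ; Join "<tag>", "text", "</tag>" back into a single line; the two emptied slots are marked "<>"
    For $x = 3 To $b[0]
        If Not StringInStr($b[$x - 1], ">") Then
            $b[$x] = $b[$x - 2] & $b[$x - 1] & $b[$x]
            $b[$x - 2] = "<>"
            $b[$x - 1] = "<>"
        EndIf
    Next
    ; Copy everything except the "<>" placeholders into a fresh array
    Dim $c[$b[0] + 1]
    $i = 0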
    For $x = 1 To $b[0]
        If $b[$x] = "<>" Then ContinueLoop
        $i += 1
        $c[$i] = $b[$x]
    Next
    $b = $c
    $c = ""
    ReDim $b[$i + 1]
    $b[0] = UBound($b) - 1
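    ; Indentation pass: $tabs holds the current indent, one tab per open element
    Local $tabs = ""
    For $x = 1 To $b[0]
        $b[$x] = StringStripWS($b[$x], 3)
        ; Declarations, comments and processing instructions stay unindented
        If StringLeft($b[$x], 2) = "<!" Then ContinueLoop
        If StringLeft($b[$x], 2) = "<?" Then ContinueLoop
        ; Self-closing tag: indent, depth unchanged
        If StringLeft($b[$x], 1) = "<" And StringRight($b[$x], 2) = "/>" Then
            $b[$x] = $tabs & $b[$x]
            ContinueLoop
        EndIf
        ; Closing tag: step out one level, then indent
        If StringLeft($b[$x], 2) = "</" And StringRight($b[$x], 1) = ">" Then
            $tabs = StringTrimRight($tabs, 1)
            $b[$x] = $tabs & $b[$x]
            ContinueLoop
        EndIf
        ; Opening tag: indent, then step in one level
        If StringLeft($b[$x], 1) = "<" And StringRight($b[$x], 1) = ">" And Not StringInStr($b[$x], '</') Then
            $b[$x] = $tabs & $b[$x]
            $tabs &= @TAB
            ContinueLoop
        EndIf
        ; Anything else (text, or a joined "<tag>text</tag>" line): indent only
        $b[$x] = $tabs & $b[$x]
    Next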
    ConsoleWrite('+ msWordXML_Beautify done in about ' & Round(TimerDiff($iTimer), 5) & ' mSec.' & @CRLF)
    Local $sOut = ""
    If $iEcho Then
        For $x = 1 To $b[0]
            If $iEcho = 1 Then
                ConsoleWrite( $b[$x] & @CRLF )
            Else
                $sOut &= $b[$x] & @CRLF
            EndIf
        Next
    EndIf
    If $iEcho = 2 Then Return $sOut
    Return $b
EndFunc   ;==>msWordXML_Beautify
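
The indentation pass at the end of the function boils down to sorting each line into one of a few kinds and adjusting the tab depth to match. A minimal standalone sketch of that rule, for illustration only: `_TagKind` and the sample lines are made up here and are not part of the script above.

```autoit
; Hypothetical helper: classify one line the same way the indentation loop does.
Func _TagKind($sLine)
    $sLine = StringStripWS($sLine, 3) ; 3 = strip leading and trailing whitespace
    If StringLeft($sLine, 2) = "<!" Or StringLeft($sLine, 2) = "<?" Then Return "skip"
    If StringLeft($sLine, 1) = "<" And StringRight($sLine, 2) = "/>" Then Return "self"
    If StringLeft($sLine, 2) = "</" And StringRight($sLine, 1) = ">" Then Return "close"
    If StringLeft($sLine, 1) = "<" And StringRight($sLine, 1) = ">" And Not StringInStr($sLine, "</") Then Return "open"
    Return "inline" ; text, or "<tag>text</tag>" already joined on one line
EndFunc   ;==>_TagKind

; Made-up sample lines, one tag per line as the function has them at this stage
Local $aDemo[6] = ['<?xml version="1.0"?>', '<w:p>', '<w:r>', '<w:t>Hello</w:t>', '</w:r>', '</w:p>']
Local $sTabs = "", $sKind = ""
For $sLine In $aDemo
    $sKind = _TagKind($sLine)
    If $sKind = "close" Then $sTabs = StringTrimRight($sTabs, 1) ; step out before printing
    If $sKind = "skip" Then
        ConsoleWrite($sLine & @CRLF) ; declarations and comments stay unindented
    Else
        ConsoleWrite($sTabs & $sLine & @CRLF)
    EndIf
    If $sKind = "open" Then $sTabs &= @TAB ; step in after printing
Next
```

Run from SciTE with F5, this writes the sample lines to the console with one tab per nesting level, which is the same shape the beautifier produces for a real document.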

Hope it saves someone some time.

Edit 1: it also works nicely with <?mso-application progid="Excel.Sheet"?>. It may just work with any XML, but I have not tested that.
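
Since the clipboard check in the script only accepts Word.Document, trying it on Excel or other XML means loosening that check first. One possible sketch, assuming a hypothetical helper `_GetMsoProgId` (not part of the script above) that pulls the progid out of the `<?mso-application ...?>` line so the caller can decide what to accept:

```autoit
; Hypothetical helper: returns the progid named in the <?mso-application ...?>
; processing instruction, or "" when the text has none.
Func _GetMsoProgId($sXml)
    ; Flag 1 = return an array of captured groups; @error is set when nothing matches
    Local $aMatch = StringRegExp($sXml, '<\?mso-application\s+progid="([^"]+)"\s*\?>', 1)
    If @error Then Return ""
    Return $aMatch[0]
EndFunc   ;==>_GetMsoProgId

; Made-up sample input for illustration
Local $sSample = '<?xml version="1.0"?><?mso-application progid="Excel.Sheet"?><Workbook/>'
Local $sProgId = _GetMsoProgId($sSample)
If $sProgId = "" Then
    ConsoleWrite("no mso-application line found" & @CRLF)
Else
    ConsoleWrite("progid: " & $sProgId & @CRLF)
EndIf
```

An empty result would mean plain XML without an Office header, which is the untested case mentioned above.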

Edit 2: fixed an error in the code

Edited by argumentum
