MattHiggs

Duplicate bookmark removal

10 posts in this topic

#1 ·  Posted (edited)

Hey all.  I recently had an issue where Xmarks, a cross-browser bookmark synchronization add-on, completely jacked up my bookmarks (duplicates everywhere, items in incorrect folders, just a disaster).  Needless to say, Xmarks is now on my crap-list of never-touch-again software, but I still needed a way to get my bookmarks back to normal.  Unfortunately, the categorization will probably have to be done manually, but I figured removing the duplicates should be scriptable.  This script is the result of that need.  You start by opening a browser that has the messed-up bookmarks, going to the bookmark manager, and exporting the bookmarks as an HTML bookmark file.  Then you open that file from this script's prompt, select a save location for the new HTML file that will be generated, and wait for the script to finish.  As of now, this script doesn't let you combine multiple HTML files into a single one containing all unique entries, since my issue didn't call for it, but if you need this functionality, I could always add it in (or you could, doesn't really matter to me :P).  So, without further ado:

#Region ;**** Directives created by AutoIt3Wrapper_GUI ****
#AutoIt3Wrapper_UseX64=y
#AutoIt3Wrapper_Change2CUI=y
#AutoIt3Wrapper_Res_SaveSource=y
#AutoIt3Wrapper_Res_Language=1033
#AutoIt3Wrapper_Res_requestedExecutionLevel=highestAvailable
#EndRegion ;**** Directives created by AutoIt3Wrapper_GUI ****
#cs ----------------------------------------------------------------------------

 AutoIt Version: 3.3.15.0 (Beta)
 Author:  William Higgs

 Script Function:
    This script is meant to scan the html bookmark file exported by any internet browser, scan the file for any duplicate bookmark items,
    remove them, cleanup any empty leftover groups, then create a new html bookmark file that can then be imported back into your browser,
    removing all duplicate bookmark links.

#ce ----------------------------------------------------------------------------

; Script Start - Add your code below here
#include <Array.au3>
#include <Constants.au3>
$open = FileOpenDialog ( "Select html containing exported bookmarks", "", "HTML file (*.html)", 3 )
$save = FileSaveDialog ( "Where do you want to save modified bookmark file?", "", "HTML file (*.html)", 18 )
ProgressOn ( "Please wait", "Scanning bookmark file for bookmarks", "", Default, Default, 18 )
$oIE = FileReadToArray ( $open )
$num = @extended
Local $old[$num]
For $h = 0 To $num - 1 Step 1
    $old[$h] = StringStripWS ( $oIE[$h], 3 )
Next
Local $old2[$num]
$num2 = 0
For $i = 0 To $num - 1 Step 1
    If StringCompare ( StringLeft ( $old[$i], 6 ), "<DT><A" ) = 0 Then
        $begin = StringInStr ( $old[$i], '"' )
        $end = StringInStr ( $old[$i], '"', Default, 2 )
        $old2[$i] = StringMid ( $old[$i], $begin + 1, $end - $begin )
    Else
        $old2[$i] = "NA"
    EndIf
Next
$index = _ArrayFindAll ( $old2, "NA" )
Local $temp[UBound ( $index ) + 1]
$temp[0] = UBound ( $index )
For $i = 0 To UBound ( $index ) - 1 Step 1
    $temp[$i+1] = $index[$i]
Next

$newnum = _ArrayDelete ( $old2, $temp )
$old2 = _ArrayUnique ( $old2 )

For $i = 1 To $old2[0] Step 1
    ProgressSet ( ( $i / $old2[0] ) * 100, $i & " out of " & $old2[0], "Removing duplicates..." )
    $indexi = _ArrayFindAll ( $old, $old2[$i], Default, Default, Default, 1 )
    If UBound ( $indexi ) > 1 Then
        Local $delete[UBound ( $indexi )]
        $delete[0] = UBound ( $indexi ) - 1
        For $t = 1 To UBound ( $indexi ) - 1 Step 1
            $delete[$t] = $indexi[$t]
        Next
        _ArrayDelete ( $old, $delete )
    Else
        ContinueLoop
    EndIf
Next

ProgressSet ( 0, "Will begin shortly.", "Cleaning up the file." )
cleanup ( $old )

$hold = 0
$file = FileOpen ( $save, 2 )
ProgressSet ( 0, "Will begin shortly.", "Writing new bookmarks file." )
For $i = 0 To UBound ( $old ) - 1 Step 1
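    ; Parenthesize the divisor: "/" binds tighter than "-", so the original computed ( $i / UBound ) - 1
    ProgressSet ( ( $i / ( UBound ( $old ) - 1 ) ) * 100, $i & " out of " & UBound ( $old ) - 1, "Writing new bookmarks file." )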
    If StringCompare ( StringLeft ( $old[$i], 7 ), "<DL><p>" ) = 0 Then
        If $hold = 0 Then
            FileWriteLine ( $file, $old[$i] )
            $hold += 1
        Else
            For $o = 1 To $hold Step 1
                FileWrite ( $file, @TAB )
            Next
            FileWriteLine ( $file, $old[$i] )
            $hold += 1
        EndIf
    ElseIf StringCompare ( StringLeft ( $old[$i], 8 ), "</DL><p>" ) = 0 Then
        If $hold = 1 Then
            $hold -= 1
            FileWriteLine ( $file, $old[$i] )
        Else
            $hold -= 1
            For $o = 1 To $hold Step 1
                FileWrite ( $file, @TAB )
            Next
            FileWriteLine ( $file, $old[$i] )
        EndIf
    Else
        If $hold = 0 Then
            FileWriteLine ( $file, $old[$i] )
        Else
            For $o = 1 To $hold Step 1
                FileWrite ( $file, @TAB )
            Next
            FileWriteLine ( $file, $old[$i] )
        EndIf
    EndIf
Next
FileClose ( $file )
ProgressOff ()
#Region --- CodeWizard generated code Start ---

;MsgBox features: Title=Yes, Text=Yes, Buttons=OK, Icon=Info
MsgBox($MB_OK + $MB_ICONASTERISK,"Finished",'Your new bookmark html file can be found at "' & $save & '" and imported back into your browser.')
#EndRegion --- CodeWizard generated code End ---


Func cleanup ( ByRef $the )
    $restart = False
    For $i = 0 To UBound ( $the ) - 1 Step 1
        If StringCompare ( StringLeft ( $the[$i], 8 ), "<DT><H3>" ) = 0 Then
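            ; Bounds check first so $the[$i+2] is never read past the end of the array
            If $i + 2 < UBound ( $the ) And StringCompare ( StringLeft ( $the[$i+2], 8 ), "</DL><p>" ) = 0 Then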
                _ArrayDelete ( $the, $i & "-" & $i+2 )
                $restart = True
                ExitLoop
            Else
                ContinueLoop
            EndIf
        Else
            ContinueLoop
        EndIf
    Next
    If $restart = True Then
        cleanup ( $the )
    EndIf
EndFunc

Edit: Updated the script.  I found that the way it was before, it was skipping over URLs that were duplicates but had a different display name.  This version looks solely at the URL and compares the URL values.  It also cleans up any empty leftover folders at the end.
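
For anyone who wants to see the "compare on the URL only" idea in isolation, here is a minimal sketch. It is not the script above: it swaps the quote-position lookup for a regular expression and uses a Scripting.Dictionary (a Windows COM object) to remember URLs already seen. The helper name _GetBookmarkUrl and the sample lines are made up for illustration.

```autoit
#include <StringConstants.au3>

; Sample lines in the exported-bookmark format (illustrative data only).
Local $aLines[4] = [ _
        '<DT><A HREF="https://example.com/" ADD_DATE="1">Example</A>', _
        '<DT><H3>Some folder</H3>', _
        '<DT><A HREF="https://example.com/" ADD_DATE="2">Same URL, other name</A>', _
        '<DT><A HREF="https://example.org/" ADD_DATE="3">Another site</A>']

; Assumption: Scripting.Dictionary is available (standard on Windows).
Local $oSeen = ObjCreate("Scripting.Dictionary")

For $sLine In $aLines
    Local $sUrl = _GetBookmarkUrl($sLine)
    ; A bookmark line whose URL was already seen is a duplicate: skip it.
    If $sUrl <> "" And $oSeen.Exists($sUrl) Then ContinueLoop
    If $sUrl <> "" Then $oSeen.Add($sUrl, True)
    ; Folder headers and first occurrences are kept.
    ConsoleWrite($sLine & @CRLF)
Next

; Hypothetical helper: returns the HREF value of a bookmark line,
; or "" if the line is not a bookmark entry.
Func _GetBookmarkUrl($sLine)
    Local $aMatch = StringRegExp($sLine, '(?i)^<DT><A\s[^>]*?HREF="([^"]*)"', $STR_REGEXPARRAYMATCH)
    If @error Then Return ""
    Return $aMatch[0]
EndFunc   ;==>_GetBookmarkUrl
```

The display name never enters the comparison, so two entries with the same URL but different titles count as duplicates, which is the behavior described in the edit note.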

 

Edit 9/10/2017: For those of you who were having issues compiling the script, as stated, you are likely using an outdated version of autoit to compile it.  I copied and pasted the script into my editor and compiled it just fine, which I will now attach to this post.

dup_book.exe

Edited by MattHiggs

#2 ·  Posted

Just a hint in case you're referring to Firefox bookmarks: you may have a better time using the SQLite bookmark database directly rather than having to mess with the HTML export. Depending on what exactly you need to do, you may even be able to get along with a good third-party SQLite manager (SQLite Expert is perfect) without having to write a single line of AutoIt code.

Tech points to consider: your profile bookmarks are in C:\Users\<username>\AppData\Roaming\Mozilla\Firefox\Profiles\<profile>.default where the places.sqlite DB lives. This is a WAL mode DB, so there are likely two extra files, places.sqlite-shm and places.sqlite-wal, when Firefox runs. Don't just copy the .sqlite file alone; it will be corrupt if not used with the two other files.

Then, for safety, duplicate the moz_bookmarks table to "bookmarks_backup", then again to "bookmarks_with_dups", and de-dup from there using SQL. Once you're sure your SQL is correct and delivers the result you want, you can empty the real bookmark table and copy your resulting output back there (don't DROP the table, just delete from it). Afterwards, you can drop your backup tables when you're sure everything works fine. There's no hurry, since FF will obviously ignore the extra tables you've created for the operation.
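
If you do want to look at the database from AutoIt, a read-only query that lists duplicated URLs could be sketched roughly as below. Treat it as a sketch under assumptions: the file name places_copy.sqlite is hypothetical (a copy taken while Firefox is closed), the column names (moz_bookmarks.fk pointing at moz_places.id, moz_places.url, type = 1 for plain bookmarks) are my understanding of the schema and should be checked in your SQLite manager first, and _SQLite_Startup needs to be able to find a sqlite3.dll.

```autoit
#include <SQLite.au3>
#include <Array.au3>

; Assumption: a COPY of the profile database, never the live file.
Local $sDbCopy = @ScriptDir & "\places_copy.sqlite"

_SQLite_Startup()
If @error Then Exit MsgBox(16, "SQLite", "sqlite3.dll could not be loaded.")

Local $hDb = _SQLite_Open($sDbCopy)
If @error Then
    _SQLite_Shutdown()
    Exit MsgBox(16, "SQLite", "Could not open " & $sDbCopy)
EndIf

; Assumed schema: each bookmark row references its URL through fk -> moz_places.id.
Local $sSQL = "SELECT p.url, COUNT(*) AS copies " & _
        "FROM moz_bookmarks AS b JOIN moz_places AS p ON p.id = b.fk " & _
        "WHERE b.type = 1 " & _
        "GROUP BY p.url HAVING COUNT(*) > 1 " & _
        "ORDER BY copies DESC;"

Local $aResult, $iRows, $iCols
If _SQLite_GetTable2d($hDb, $sSQL, $aResult, $iRows, $iCols) = $SQLITE_OK Then
    ; Row 0 holds the column names; the following rows hold one duplicated URL each.
    _ArrayDisplay($aResult, "Bookmarked URLs that appear more than once")
Else
    MsgBox(16, "SQLite error", _SQLite_ErrMsg($hDb))
EndIf

_SQLite_Close($hDb)
_SQLite_Shutdown()
```

This only reads. The actual de-dup (backup tables, delete, copy back) is still easier to do step by step in a manager, where you can inspect each result before committing to it.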


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

#3 ·  Posted

7 hours ago, jchd said:

Just a hint in case you're referring to Firefox bookmarks: you may have a better time using the SQLite bookmark database directly rather than having to mess with the HTML export. Depending on what exactly you need to do, you may even be able to get along with a good third-party SQLite manager (SQLite Expert is perfect) without having to write a single line of AutoIt code.

Tech points to consider: your profile bookmarks are in C:\Users\<username>\AppData\Roaming\Mozilla\Firefox\Profiles\<profile>.default where the places.sqlite DB lives. This is a WAL mode DB, so there are likely two extra files, places.sqlite-shm and places.sqlite-wal, when Firefox runs. Don't just copy the .sqlite file alone; it will be corrupt if not used with the two other files.

Then, for safety, duplicate the moz_bookmarks table to "bookmarks_backup", then again to "bookmarks_with_dups", and de-dup from there using SQL. Once you're sure your SQL is correct and delivers the result you want, you can empty the real bookmark table and copy your resulting output back there (don't DROP the table, just delete from it). Afterwards, you can drop your backup tables when you're sure everything works fine. There's no hurry, since FF will obviously ignore the extra tables you've created for the operation.

You know, I had discovered that while I was researching various aspects of automating bookmark processes, but for the life of me, I couldn't figure out how to query the database for the information that I needed using AutoIt's SQLite UDF.  The array I kept getting back was empty.  And I did all of this without moving any of the databases or files from their original location.  Now I'm not saying you are wrong; that is on me, the SQLite UDF confused the hell out of me.  Or are you suggesting skipping the SQLite UDF altogether and just running SQL directly within the script?

#4 ·  Posted

As I said, you can perform all the SQL operations you need from within a solid third-party SQLite manager. Writing a script around this is just calling for pointless complications. I warmly recommend SQLite Expert.

Just ask if you need assistance doing so.



#5 ·  Posted

6 minutes ago, jchd said:

As I said, you can perform all the SQL operations you need from within a solid third-party SQLite manager. Writing a script around this is just calling for pointless complications. I warmly recommend SQLite Expert.

Just ask if you need assistance doing so.

OK.  I will definitely look into that.  The upside of doing it this way, though, is that it works with whatever browser you use: Chrome and Firefox both have bookmark managers that let you export and import HTML files.  But I will still definitely see if I can read the database files directly.


#6 ·  Posted

All nice and good ... small problem ... 

--------------------------------------------------------------

; Script Start - Add your code below here
#include <Array.au3>
#include <Constants.au3>
$open = FileOpenDialog ( "Select html containing exported bookmarks", "", "HTML file (*.html)", 3 )
$save = FileSaveDialog ( "Where do you want to save modified bookmark file?", "", "HTML file (*.html)", 18 )
ProgressOn ( "Please wait", "Scanning bookmark file for bookmarks", "", Default, Default, 18 )
$oIE = FileReadToArray ( $open ) <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

ERROR: FileReadToArray(): Undefined function ; That is line 26.


$num = @extended
Local $old[$num]
For $h = 0 To $num - 1 Step 1
    $old[$h] = StringStripWS ( $oIE[$h], 3 )
Next
Local $old2[$num]
$num2 = 0

Screenshot_1.jpg


#7 ·  Posted

And I was looking for this ... I wrote automation to check the URLs from the bookmarks manager, but it takes time and you can't do anything else while it runs :)

HEY I'm NOT a programmer, wish I was!!

So after finding bookmarks with the same name but different URLs, I created one more to remove by name ... yeah I know ... kid's stuff for you, but hey, for a non-programmer it's better than nothing :)

But before getting into the deep with this I thought I will do a specific search with autoit in mind and here I am.

Unfortunately the script does not work, doesn't run and obviously does not compile either.

I hope you work it out William. Laterz guys.


#8 ·  Posted

Last but not least ... the language ... is it a popular one? Is it C or C++?

Because if it is I think I will add to the "to do" list  :) 


#9 ·  Posted

Make sure you're running the latest version of AutoIt, the language this is written in. Looks like you might have an older version from that error message.
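
A quick way to check is to print the @AutoItVersion macro from a one-line script. If upgrading is not an option, the sketch below shows one possible workaround, assuming the older UDF _FileReadToArray from File.au3 is present in your install; the file name bookmarks.html is hypothetical. Note that the UDF stores the line count in element [0], so the loops in the posted script would need to start at index 1 instead of 0.

```autoit
#include <File.au3>

; Show which AutoIt version is actually running this script.
ConsoleWrite("AutoIt version: " & @AutoItVersion & @CRLF)

; Hypothetical input file, standing in for the file picked in FileOpenDialog.
Local $sFile = @ScriptDir & "\bookmarks.html"

; UDF alternative to the built-in FileReadToArray:
; fills $aLines by reference, with the number of lines in $aLines[0].
Local $aLines
If _FileReadToArray($sFile, $aLines) Then
    ConsoleWrite("Read " & $aLines[0] & " lines from " & $sFile & @CRLF)
Else
    ConsoleWrite("Could not read " & $sFile & ", @error = " & @error & @CRLF)
EndIf
```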


If I posted any code, assume that code was written using the latest release version unless stated otherwise. Also, if it doesn't work on XP I can't help with that because I don't have access to XP, and I'm not going to.
Give a programmer the correct code and he can do his work for a day. Teach a programmer to debug and he can do his work for a lifetime - by Chirag Gude
How to ask questions the smart way!

I hereby grant any person the right to use any code I post, that I am the original author of, on the autoitscript.com forums, unless I've specifically stated otherwise in the code or the thread post. If you do use my code all I ask, as a courtesy, is to make note of where you got it from.

Back up and restore Windows user files _Array.au3 - Modified array functions that include support for 2D arrays.  -  ColorChooser - An add-on for SciTE that pops up a color dialog so you can select and paste a color code into a script.  -  Customizable Splashscreen GUI w/Progress Bar - Create a custom "splash screen" GUI with a progress bar and custom label.  -  _FileGetProperty - Retrieve the properties of a file  -  SciTE Toolbar - A toolbar demo for use with the SciTE editor  -  GUIRegisterMsg demo - Demo script to show how to use the Windows messages to interact with controls and your GUI.  -   Latin Square password generator

#10 ·  Posted

Updated the original post by attaching the executable compiled from the posted script.  Works fine, guys.  Not sure why you can't compile it, but I would bet @BrewManNH has the best hypothesis.
