Extract data from Website. - (Moved) V0.2


Hello everyone,

Some time back, when I wasn't in the best part of my life, I needed this program for a job that required copying and pasting ~21,000 times.

I made a topic about this and got stuck with a person who thought that I was some kind of coder or had a degree in IT and C 😄

Anyway, I searched Google for a similar setup and came across this topic: https://www.autoitscript.com/forum/topic/186961-copy-paste-data-from-a-web-page/ and also my own topic. I'm baffled that the people in the first topic got along so well, while in my topic everyone just "encouraged" me to read the help files.

Anyhow, does anyone know a better topic on this matter, for reading and documentation purposes? I would really love to make AutoIt work in some way other than clicking and pasting.
 

Over and out.

And happy new year to all of you too :)


@BogdanNicolescu, I read both of your posts and, as someone who learned AutoIt with no prior programming knowledge, I fully understand the frustration you are experiencing. Below is one approach you can try. I am still learning, so experienced programmers will surely find that my code needs a lot of improvement, but I hope it serves as a starting point for improving your AutoIt skills.

It would be best to collect the information with the InetRead() function, which lets you do the job without actually opening the website in a browser. I didn't use InetRead() because I could not work out the URLs. Maybe someone with more experience and knowledge can help you with that. My approach opens the website and clicks through it, so the script cannot run in the background and it takes considerably longer. In Korea, where I live, it takes about 3 to 5 seconds to fetch the data for one school. It will probably take less time in your country because the server is physically closer.

My code does not use Excel. Instead it creates one text file for all the schools, each line containing the Denumire, Județ, Localitate and Email fields delimited with tabs. I tested this code under Windows 10 on a PC with 1920x1080 resolution. You may have to change some parts of the code if your screen resolution is lower. You are welcome to ask any questions about my code. Good luck.

#include <IE.au3>
#include <String.au3>
#include <Array.au3>

Opt("WinTitleMatchMode", 2)

HotKeySet("^!x", "_Exit")
OnAutoItExitRegister("_Exit")

While WinExists("Cartografia")
    WinClose("Cartografia")
WEnd
$oIE = _IECreate("https://www.siiir.edu.ro/carto/")
_IELoadWait($oIE)
WinSetState("Cartografia", "", @SW_MAXIMIZE)

$tableTopColor = 0x006CB7
$tableBorderColor = 0x0058A3
$center = @DesktopWidth/2

Dim $xy1 = 0, $xy2 = 0, $xy3 = 0
While Not (IsArray($xy1) And IsArray($xy2) And IsArray($xy3))
    $xy1 = PixelSearch($center, 100, $center, @DesktopHeight, $tableTopColor)
    $xy2 = PixelSearch($center, 500, $center, @DesktopHeight, $tableBorderColor)
    $xy3 = PixelSearch($center, 500, @DesktopWidth, 500, $tableBorderColor)
    Sleep(50)
WEnd
$tableTopY = $xy1[1]
$tableBottomY = $xy2[1]
$tableRightX = $xy3[0]
$pageNumberX = $tableRightX-200
$pageNumberY = $tableBottomY-30

$fn = FileOpen("Cartografia_școlară.txt", 2+128)
FileWriteLine($fn, "Denumire" & @TAB & "Județ" & @TAB & "Localitate" & @TAB & "Email")

For $page = 1 To 1 ; 1860
    If $page <> 1 Then
        ClipPut($page)
        MouseClick("left", $pageNumberX, $pageNumberY)
        Send("^a")
        Send("^v")
        _IELoadWait($oIE)
    EndIf
    For $i = 1 To 10
        MouseMove($center, $i*31+$tableTopY+90)
        If MouseGetCursor() <> 0 Then
            _Exit()
        EndIf
        MouseClick("left")
        _IELoadWait($oIE)

        $h = TimerInit()
        While 1
            $sText = _IEBodyReadHTML($oIE)
            If Not StringInStr($sText, 'Pagina web :') Then
                Sleep(100)
                ContinueLoop
            EndIf

            $sText = _StringBetween($sText, '</header>', 'Pagina web :')[0]
            $aText = StringSplit($sText, @LF, 1)
            ;_ArrayDisplay($aText)
            $nContact = 0
            $sName = ""
            $sCounty = ""
            $sCity = ""
            $sEmail = ""
            For $j = 1 To $aText[0]
                $s = $aText[$j]
                If StringInStr($s, '<h4 class="modal-title ng-binding" id="myModalLabel">') Then
                    $sName = _StringBetween($s, ">", "</h4>")[0]
                EndIf
                If StringInStr($s, '<p class="col-sm-8 ng-binding">') Then
                    $nContact += 1
                    If $nContact = 1 Then
                        $sCounty = _StringBetween($s, ">", " <")[0]
                    ElseIf $nContact = 2 Then
                        $sCity = _StringBetween($s, ">", " <")[0]
                    ElseIf $nContact = 8 Then
                        $sEmail = _StringBetween($s, ">", " <")[0]
                        ExitLoop
                    EndIf
                EndIf
            Next

            If $sCounty = "-" Then
                If TimerDiff($h) > 10000 Then
                    ExitLoop
                Else
                    Sleep(100)
                    ContinueLoop
                EndIf
            Else
                ExitLoop
            EndIf
        WEnd
        FileWriteLine($fn, $sName & @TAB & $sCounty & @TAB & $sCity & @TAB & $sEmail)
        _IEAction($oIE, "back")
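        ; Wait until the table is drawn again. PixelGetColor's optional third parameter
        ; is a window handle, not the IE object, so it is omitted here.
        While PixelGetColor($center, $tableBottomY) <> $tableBorderColor
            Sleep(50)
        WEnd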
        If $page <> 1 Then
            ClipPut($page)
            MouseClick("left", $pageNumberX, $pageNumberY)
            Send("^a")
            Send("^v")
            _IELoadWait($oIE)
        EndIf
    Next
    If Mod($page, 100) = 0 Then
        FileFlush($fn)
    EndIf
Next


Func _Exit()
    If IsDeclared("fn") Then
        FileClose($fn)
    EndIf
    Exit
EndFunc
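
To check the result before importing it into Excel, the tab-delimited file can be read back into a 2D array. This is only a minimal sketch: it assumes the file name used above, that the script is saved as UTF-8 with BOM so the non-ASCII file name survives, and that every line has the same four fields (otherwise _FileReadToArray reports an error).

```autoit
#include <File.au3>
#include <Array.au3>

Local $aRows
; With a delimiter given, _FileReadToArray splits every line into columns (2D array).
; $FRTA_NOCOUNT means element [0] holds the header line instead of a row count.
If Not _FileReadToArray("Cartografia_școlară.txt", $aRows, $FRTA_NOCOUNT, @TAB) Then
    Exit MsgBox(4096, "", "Could not read the file, @error = " & @error)
EndIf
_ArrayDisplay($aRows, "Denumire / Județ / Localitate / Email")
```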

 

Edited by CYCho

A lot depends on the website. For instance, the one you referenced is generated with JavaScript and is quite complex to parse, because the data is dynamic and unique identifiers are difficult to find. However, if your Excel spreadsheet was configured as:

A = Name
B = Email

Here is something to get you started. I haven't commented the process as I don't have the time, but hopefully you can work through the code and figure out how it works. As I mentioned, this site was a little more difficult to parse due to JavaScript events, so I had to use some workarounds which you wouldn't normally need.

Hope that helps.

#include <Array.au3>
#include <Excel.au3>
#include <IE.au3>

Local $sWorkbook = @ScriptDir & "\Excel_Temp.xls"
Local $oExcel = _Excel_Open()
If @error Then Exit MsgBox(4096, "", "Error opening Excel")
Local $oWorkbook = _Excel_BookOpen($oExcel, $sWorkbook)
If @error Then
    $oWorkbook = _Excel_BookAttach($sWorkbook)
    If @error Then Exit MsgBox(4096, "", "Error opening workbook")
EndIf
Local $aRange = _Excel_RangeRead($oWorkbook, Default, "A1:B20")
If @error Then Exit MsgBox(4096, "", "Error reading Excel Range")
_ArrayDisplay($aRange)
Local $oIE, $oDivs, $oDenumire, $oSearch, $oResults
$oIE = _IECreate("https://www.siiir.edu.ro/carto/", 1)
For $i = 1 To UBound($aRange) - 1
    $oIE = _IEAttach("https://www.siiir.edu.ro/carto/", "url")
    Sleep(1000)
    $oDivs = _IETagNameGetCollection($oIE, "div")
    If IsObj($oDivs) Then
        For $oDiv In $oDivS
            If $oDiv.className = "ngHeaderCell ng-scope col3 colt3" And $oDiv.getAttribute("tooltip-append-to-body") = Null Then
                $oDenumire = $oDiv.firstElementChild.nextElementSibling.firstElementChild.firstElementChild
                _IEAction($oDenumire, "focus")
                _IEAction($oDenumire, "selectall")
                $oDenumire.value = $aRange[$i][0]
                ControlSend(_IEPropertyGet($oIE, "Title"), "", "", "{END} {BACKSPACE}")
                $oSearch = $oDiv.firstElementChild.nextElementSibling.firstElementChild.firstElementChild.nextElementSibling.firstElementChild
                _IEAction($oSearch, "click")
                Sleep(2000)
                $oDivs = _IETagNameGetCollection($oIE, "div")
                If IsObj($oDivs) Then
                    For $oDiv In $oDivs
                        If $oDiv.className = "ngCanvas" Then
                            $oResults = _IETagNameGetCollection($oDiv, "div")
                            If IsObj($oResults) Then
                                For $oResult In $oResults
                                    If $oResult.getAttribute("ng-style") = "rowStyle(row)" Then
                                        _IEAction($oResult, "click")
                                        Sleep(2000)
                                        $oSpans = _IETagNameGetCollection($oIE, "span")
                                        If IsObj($oSpans) Then
                                            For $oSpan In $oSpans
                                                If $oSpan.InnerText = "Email :" Then
                                                    If StringInStr($aRange[$i][1], $oSpan.nextElementSibling.innerText) Then
                                                        ContinueLoop
                                                    ElseIf $aRange[$i][1] <> "" Then
                                                        $aRange[$i][1] &= ";" & $oSpan.nextElementSibling.innerText
                                                    Else
                                                        $aRange[$i][1] = $oSpan.nextElementSibling.innerText
                                                    Endif
                                                    _IEAction($oIE, "back")
                                                    Sleep(2000)
                                                    ExitLoop 4
                                                EndIf
                                            Next
                                        EndIf
                                    EndIf
                                Next
                            EndIf
                        EndIf
                    Next
                EndIf
            EndIf
        Next
    EndIf
Next
_Excel_RangeWrite($oWorkbook, $oWorkbook.Activesheet, $aRange, "A1")
_Excel_BookSave($oWorkbook)

 


This is a chromedriver version of my previous code, which was based on IE.au3. For installing chromedriver, see the WebDriver wiki or the WebDriver UDF. In my opinion, chromedriver is more stable than IE.au3.

; The WebDriver UDF's are in the same directory as this script

#include "wd_core.au3"
#include "wd_helper.au3"
#include <String.au3>
#include <Array.au3>
#include <Excel.au3>

Opt("WinTitleMatchMode", 2)

HotKeySet("^!x", "_Exit")
OnAutoItExitRegister("_Exit")

While WinExists("Cartografia")
    WinClose("Cartografia")
WEnd

Local $sDesiredCapabilities, $sSession, $url = "https://www.siiir.edu.ro/carto/"

$fn = FileOpen("Cartografia_școlară.txt", 2+128)
FileWriteLine($fn, "Denumire" & @TAB & "Județ" & @TAB & "Localitate" & @TAB & "Email")

$_WD_DEBUG = $_WD_Debug_None
$_WD_ERROR_MSGBOX = False
SetupChrome()
_WD_Startup()

$sSession = _WD_CreateSession($sDesiredCapabilities)
Sleep(1000)

_WD_Navigate($sSession, $url)
_WD_LoadWait($sSession)

$maxPage = 0
While $maxPage = 0
    $eInput = _WD_FindElement($sSession,  $_WD_LOCATOR_ByXPath, "//input[@class='ngPagerCurrent form-control ng-pristine ng-valid ng-valid-number ng-valid-max ng-valid-min']")
    $maxPage = _WD_ElementAction($sSession, $eInput, "attribute", "max")
    Sleep(100)
WEnd

For $page = 1010 To 1010 ; $maxPage
    ; wait until row elements are found
    _WD_WaitElement($sSession,  $_WD_LOCATOR_ByXPath, "//div[@class='ngCellText ng-scope col3 colt3']")

    If $page <> 1 Then ; go to the $page
        If $page > $maxPage Then
            ExitLoop
        EndIf
        $eInput = _WD_FindElement($sSession,  $_WD_LOCATOR_ByXPath, "//input[@class='ngPagerCurrent form-control ng-pristine ng-valid ng-valid-number ng-valid-max ng-valid-min']")
        _WD_ElementAction($sSession, $eInput, "value", 1)
        Send("^a")
        Sleep(50)
        ClipPut($page)
        Send("^v")
        Sleep(2000)
    EndIf

    ; obtain array of Denumire fields from the table
    $sSource = _WD_GetSource($sSession)
    $aSource = _StringBetween($sSource, '<div class="ngCellText ng-scope col3 colt3" ng-class="col.colIndex()"><span ng-cell-text="" class="ng-binding">', '</span>')
;   _ArrayDisplay($aSource)

    For $row = 0 To UBound($aSource)-1
        ; find row element id's and click one by one
        $aRows = _WD_FindElement($sSession,  $_WD_LOCATOR_ByXPath, "//div[@class='ngCellText ng-scope col3 colt3']", "", True)
        _WD_ElementAction($sSession, $aRows[$row], "click")
        Sleep(1000)
        _WD_LoadWait($sSession)

        While 1
            $aText = 0
            While Not IsArray($aText) ; wait until all contents of Contact are filled in
                $sText = _WD_GetSource($sSession)
                $aText = _StringBetween($sText, '<p class="col-sm-8 ng-binding">', ' </p>')
                Sleep(50)
            WEnd
            If $aText[0] <> "-" Then ; when Județ field has been updated, we assume that Email field would also have been updated
                FileWriteLine($fn, $aSource[$row] & @TAB & $aText[0] & @TAB & $aText[1] & @TAB & $aText[7])
                ExitLoop
            EndIf
            Sleep(50)
        WEnd
        _WD_Action($sSession, "back") ; go back to main page
        _WD_LoadWait($sSession)
        If $page <> 1 And $row <> UBound($aSource)-1 Then ; main page always goes to page 1, so working page number must be re-entered
            $eInput = _WD_FindElement($sSession,  $_WD_LOCATOR_ByXPath, "//input[@class='ngPagerCurrent form-control ng-pristine ng-valid ng-valid-number ng-valid-max ng-valid-min']")
            _WD_ElementAction($sSession, $eInput, "value", 1)
            Send("^a")
            Sleep(50)
            ClipPut($page)
            Send("^v")
            Sleep(1000)
        EndIf
    Next
Next


Func _Exit()
    If IsDeclared("fn") Then FileClose($fn) ; close the output file before leaving
    _WD_Shutdown()
    Exit
EndFunc

Func SetupChrome()
    If Not FileExists("chromedriver.exe") Then
        MsgBox(4096, "", "ChromeDriver is not available in the script directory!")
        Exit
    EndIf
    _WD_Option('Driver', 'chromedriver.exe')
    _WD_Option('DriverParams', '--log-path=chrome.log')
    _WD_Option('Port', 9515)

    $sDesiredCapabilities = '{"capabilities": {"alwaysMatch": {"unhandledPromptBehavior": "ignore", ' & _
        '"goog:chromeOptions": {"w3c": true, "excludeSwitches": ["enable-automation"], "useAutomationExtension": false, ' & _
        '"prefs": {"credentials_enable_service": false}, ' & _
        '"args": ["start-maximized"] }}}}'
EndFunc

 

Edited by CYCho

2 hours ago, CYCho said:

In my opinion chromedriver is more stable than IE.au3

That's a pretty broad statement. In what ways have you found IE.au3 to be "unstable"?

_WD_ElementAction($sSession, $eInput, "value", 1)
Send("^a")
ClipPut($page)
Send("^v")
Sleep(2000)

Can you explain why you are doing it like this?

 


4 hours ago, Danp2 said:

That's a pretty broad statement. In what ways have you found IE.au3 to be "unstable"?

About a year ago I had a couple of problems with IE. One was the inability to detect and read alert messages. The other was that a page would occasionally stop loading for no apparent reason. I would not have switched to Chrome and WebDriver if I had not had these problems.

4 hours ago, Danp2 said:

 

_WD_ElementAction($sSession, $eInput, "value", 1)
Send("^a")
ClipPut($page)
Send("^v")
Sleep(2000)

Can you explain why you are doing it like this?

Maybe there is a better way. I didn't like this either, because it requires the browser window to stay on top. When I issue _WD_Action($sSession, "back"), this website always goes back to page 1 regardless of the page number I was working on. The input box defines a minimum value of 1 and I cannot "clear" the value. If I try to reset the value with _WD_ElementAction, the value I give is mixed with the existing 1 and produces an unpredictable number. The first _WD_ElementAction moves focus to the input box. Ctrl+A selects the existing value so that the paste replaces it. I could have used _WD_ElementAction again instead of sending Ctrl+V, but then the website tries to reload the table values in between each digit entered by _WD_ElementAction. I wish _WD_ElementAction could paste the value at once instead of sending characters one by one with delays in between.
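
One alternative that might avoid the clipboard and the always-on-top requirement is to set the value with JavaScript through _WD_ExecuteScript. This is an untested sketch, not something I have verified against this site. The helper name _SetPageNumber is made up, the CSS selector is derived from the ngPagerCurrent class used in my XPath, and it assumes the grid reacts to synthetic input and change events.

```autoit
#include "wd_core.au3"

; Hypothetical helper (not part of the WebDriver UDF): write the page number in one step.
; Assumes $sSession is an open session and the pager input carries the ngPagerCurrent class.
Func _SetPageNumber($sSession, $iPage)
    ; Single quotes inside the JavaScript avoid clashing with the AutoIt string quotes
    Local $sScript = "var e = document.querySelector('input.ngPagerCurrent');" & _
            "e.value = '" & Int($iPage) & "';" & _
            "e.dispatchEvent(new Event('input', {bubbles: true}));" & _
            "e.dispatchEvent(new Event('change', {bubbles: true}));"
    _WD_ExecuteScript($sSession, $sScript)
    Local $iErr = @error
    ; Returns True on success; on failure @error carries the UDF's error code
    Return SetError($iErr, 0, $iErr = 0)
EndFunc
```

If this works, _SetPageNumber($sSession, $page) would replace the ClipPut/Send sequence, and the fixed Sleep could give way to waiting for the row elements again.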

Thanks for reviewing this.

Edited by CYCho

@CYCho I would approach this project somewhat differently. Since the website serves the data up in JSON format, I would use WinHTTP to submit requests and process the JSON responses.
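
As a rough illustration of the request side, here is a minimal sketch that uses the built-in WinHttp.WinHttpRequest.5.1 COM object rather than the WinHTTP UDF. The helper name _HttpJson is my own invention, and the URL and payload of the grid's POST request are not included here, so they would have to be copied from the browser's network tab.

```autoit
; Hypothetical helper: send one request and return the response body as text.
; $sBody is only sent when it is not empty (POST); leave it empty for GET.
Func _HttpJson($sMethod, $sUrl, $sBody = "")
    Local $oHttp = ObjCreate("WinHttp.WinHttpRequest.5.1")
    If Not IsObj($oHttp) Then Return SetError(1, 0, "")
    $oHttp.Open($sMethod, $sUrl, False) ; False = synchronous request
    $oHttp.SetRequestHeader("Accept", "application/json")
    If $sBody <> "" Then
        $oHttp.SetRequestHeader("Content-Type", "application/json")
        $oHttp.Send($sBody)
    Else
        $oHttp.Send()
    EndIf
    ; Anything other than HTTP 200 is reported through @error, with the status in @extended
    If $oHttp.Status <> 200 Then Return SetError(2, $oHttp.Status, "")
    Return $oHttp.ResponseText
EndFunc
```

A GET call would then look like _HttpJson("GET", $sUrl), with $sUrl set to whichever endpoint is being queried. If accented characters come back garbled, the response encoding would need separate handling.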

For example, the appropriate POST request will return the following JSON --

{
    "data": {
        "content": [
            {
                "ROW_NUM": 9991,
                "ID_SCHOOL": 11303404,
                "CODE": "3761105259",
                "NAME": "LICEUL TEORETIC \"MIHAI EMINESCU\", MUN. BÂRLAD",
                "SHORT_NAME": "LIT MIHAI EMINESCU",
                "LOCALITY": "BÂRLAD",
                "PARENT_LOCALITY": "MUNICIPIUL BÂRLAD",
                "COUNTY": "VASLUI",
                "STATUT": "Cu personalitate juridică",
                "SCHOOL_TYPE": "Unitate de învăţământ",
                "PROPERTY_FORM": "Publică de interes naţional şi local"
            },
            {
                "ROW_NUM": 9992,
                "ID_SCHOOL": 11295837,
                "CODE": "4061204816",
                "NAME": "Liceul Teoretic Mihai Ionescu",
                "SHORT_NAME": "Liceul Mihai Ionescu",
                "LOCALITY": "BUCUREŞTI SECTORUL 3",
                "PARENT_LOCALITY": "MUNICIPIUL BUCUREŞTI",
                "COUNTY": "MUNICIPIUL BUCUREŞTI",
                "STATUT": "Cu personalitate juridică",
                "SCHOOL_TYPE": "Unitate de învăţământ",
                "PROPERTY_FORM": "Privată"
            },
            {
                "ROW_NUM": 9993,
                "ID_SCHOOL": 11296591,
                "CODE": "0261100581",
                "NAME": "LICEUL TEORETIC \"MIHAI VELICIU\" CHISINEU-CRIS",
                "SHORT_NAME": "LICEUL TEORETIC \"MIHAI VELICIU\" CHISINEU-CRIS",
                "LOCALITY": "CHIŞINEU-CRIŞ",
                "PARENT_LOCALITY": "ORAŞ CHIŞINEU-CRIŞ",
                "COUNTY": "ARAD",
                "STATUT": "Cu personalitate juridică",
                "SCHOOL_TYPE": "Unitate de învăţământ",
                "PROPERTY_FORM": "Publică de interes naţional şi local"
            },
            {
                "ROW_NUM": 9994,
                "ID_SCHOOL": 11305787,
                "CODE": "1661105781",
                "NAME": "LICEUL TEORETIC \"MIHAI VITEAZUL\" BAILESTI",
                "SHORT_NAME": "LICEUL MIHAI VITEAZUL BAILESTI",
                "LOCALITY": "BĂILEŞTI",
                "PARENT_LOCALITY": "MUNICIPIUL BĂILEŞTI",
                "COUNTY": "DOLJ",
                "STATUT": "Cu personalitate juridică",
                "SCHOOL_TYPE": "Unitate de învăţământ",
                "PROPERTY_FORM": "Publică de interes naţional şi local"
            },
            {
                "ROW_NUM": 9995,
                "ID_SCHOOL": 11289398,
                "CODE": "2861100531",
                "NAME": "LICEUL TEORETIC \"MIHAI VITEAZUL\" CARACAL",
                "SHORT_NAME": "LIC.T.\"MIHAI VITEAZUL\" CARACAL",
                "LOCALITY": "CARACAL",
                "PARENT_LOCALITY": "MUNICIPIUL CARACAL",
                "COUNTY": "OLT",
                "STATUT": "Cu personalitate juridică",
                "SCHOOL_TYPE": "Unitate de învăţământ",
                "PROPERTY_FORM": "Publică de interes naţional şi local"
            },
            {
                "ROW_NUM": 9996,
                "ID_SCHOOL": 11293999,
                "CODE": "1561100158",
                "NAME": "LICEUL TEORETIC ”MIHAI VITEAZUL” VIȘINA",
                "SHORT_NAME": "LICEUL TEORETIC ”MIHAI VITEAZUL” VIȘINA",
                "LOCALITY": "VIŞINA",
                "PARENT_LOCALITY": "VIŞINA",
                "COUNTY": "DÂMBOVIŢA",
                "STATUT": "Cu personalitate juridică",
                "SCHOOL_TYPE": "Unitate de învăţământ",
                "PROPERTY_FORM": "Publică de interes naţional şi local"
            },
            {
                "ROW_NUM": 9997,
                "ID_SCHOOL": 11296937,
                "CODE": "1361100903",
                "NAME": "LICEUL TEORETIC 'MIHAIL KOGĂLNICEANU' MIHAIL KOGĂLNICEANU",
                "SHORT_NAME": "LIC TEORETIC MIHAIL KOGĂLNICEANU",
                "LOCALITY": "MIHAIL KOGĂLNICEANU",
                "PARENT_LOCALITY": "MIHAIL KOGĂLNICEANU",
                "COUNTY": "CONSTANŢA",
                "STATUT": "Cu personalitate juridică",
                "SCHOOL_TYPE": "Unitate de învăţământ",
                "PROPERTY_FORM": "Publică de interes naţional şi local"
            },
            {
                "ROW_NUM": 9998,
                "ID_SCHOOL": 11302876,
                "CODE": "3761100046",
                "NAME": "LICEUL TEORETIC \"MIHAIL KOGĂLNICEANU\", MUN. VASLUI",
                "SHORT_NAME": "LIT MIHAIL KOGĂLNICEANU",
                "LOCALITY": "VASLUI",
                "PARENT_LOCALITY": "MUNICIPIUL VASLUI",
                "COUNTY": "VASLUI",
                "STATUT": "Cu personalitate juridică",
                "SCHOOL_TYPE": "Unitate de învăţământ",
                "PROPERTY_FORM": "Publică de interes naţional şi local"
            },
            {
                "ROW_NUM": 9999,
                "ID_SCHOOL": 11291665,
                "CODE": "2361101987",
                "NAME": "LICEUL TEORETIC \"MIHAIL KOGĂLNICEANU\" SNAGOV",
                "SHORT_NAME": "LICEUL TEORETIC \"MIHAIL KOGALNICEANU\" SNAGOV",
                "LOCALITY": "SNAGOV",
                "PARENT_LOCALITY": "SNAGOV",
                "COUNTY": "ILFOV",
                "STATUT": "Cu personalitate juridică",
                "SCHOOL_TYPE": "Unitate de învăţământ",
                "PROPERTY_FORM": "Publică de interes naţional şi local"
            },
            {
                "ROW_NUM": 10000,
                "ID_SCHOOL": 11295698,
                "CODE": "4061102617",
                "NAME": "Liceul Teoretic ”Mihail Sadoveanu”",
                "SHORT_NAME": "Lic.Teor. ”Mihail Sadoveanu”",
                "LOCALITY": "BUCUREŞTI SECTORUL 2",
                "PARENT_LOCALITY": "MUNICIPIUL BUCUREŞTI",
                "COUNTY": "MUNICIPIUL BUCUREŞTI",
                "STATUT": "Cu personalitate juridică",
                "SCHOOL_TYPE": "Unitate de învăţământ",
                "PROPERTY_FORM": "Publică de interes naţional şi local"
            }
        ],
        "pageable": {
            "sort": null,
            "filters": null,
            "pageSize": 10,
            "pageNumber": 1000,
            "offset": 10000
        },
        "filters": [],
        "metadata": [
            {
                "field": "ROW_NUM",
                "displayName": "schoolnetwork_grid.key.ROW_NUM.label",
                "fieldFilterType": "equals",
                "fieldType": null,
                "resizable": true
            },
            {
                "field": "ID_SCHOOL",
                "displayName": "schoolnetwork_grid.key.ID_SCHOOL.label",
                "fieldFilterType": "equals",
                "fieldType": null,
                "resizable": true
            },
            {
                "field": "CODE",
                "displayName": "schoolnetwork_grid.key.CODE.label",
                "fieldFilterType": "like",
                "fieldType": null,
                "resizable": true
            },
            {
                "field": "NAME",
                "displayName": "schoolnetwork_grid.key.NAME.label",
                "fieldFilterType": "like",
                "fieldType": null,
                "resizable": true
            },
            {
                "field": "SHORT_NAME",
                "displayName": "schoolnetwork_grid.key.SHORT_NAME.label",
                "fieldFilterType": "like",
                "fieldType": null,
                "resizable": true
            },
            {
                "field": "LOCALITY",
                "displayName": "schoolnetwork_grid.key.LOCALITY.label",
                "fieldFilterType": "like",
                "fieldType": null,
                "resizable": true
            },
            {
                "field": "PARENT_LOCALITY",
                "displayName": "schoolnetwork_grid.key.PARENT_LOCALITY.label",
                "fieldFilterType": "like",
                "fieldType": null,
                "resizable": true
            },
            {
                "field": "COUNTY",
                "displayName": "schoolnetwork_grid.key.COUNTY.label",
                "fieldFilterType": "like",
                "fieldType": null,
                "resizable": true
            },
            {
                "field": "STATUT",
                "displayName": "schoolnetwork_grid.key.STATUT.label",
                "fieldFilterType": "like",
                "fieldType": null,
                "resizable": true
            },
            {
                "field": "SCHOOL_TYPE",
                "displayName": "schoolnetwork_grid.key.SCHOOL_TYPE.label",
                "fieldFilterType": "like",
                "fieldType": null,
                "resizable": true
            },
            {
                "field": "PROPERTY_FORM",
                "displayName": "schoolnetwork_grid.key.PROPERTY_FORM.label",
                "fieldFilterType": "like",
                "fieldType": null,
                "resizable": true
            }
        ],
        "sort": null,
        "totalElements": 18591,
        "totalPages": 1860,
        "numberOfElements": 10,
        "firstPage": false,
        "lastPage": false,
        "size": 10,
        "number": 1000
    },
    "meta": {
        "title": "schoolnetwork_grid.key.grid.title",
        "properties": {
            "metadatas": [
                {
                    "displayName": null,
                    "field": "NAME",
                    "fieldFilterType": null,
                    "fieldType": "java.lang.String",
                    "resizable": true
                },
                {
                    "displayName": null,
                    "field": "SHORT_NAME",
                    "fieldFilterType": null,
                    "fieldType": "java.lang.String",
                    "resizable": true
                },
                {
                    "displayName": null,
                    "field": "CODE",
                    "fieldFilterType": null,
                    "fieldType": "java.lang.String",
                    "resizable": true
                },
                {
                    "displayName": null,
                    "field": "LOCALITY",
                    "fieldFilterType": null,
                    "fieldType": "java.lang.String",
                    "resizable": true
                },
                {
                    "displayName": null,
                    "field": "PARENT_LOCALITY",
                    "fieldFilterType": null,
                    "fieldType": "java.lang.String",
                    "resizable": true
                },
                {
                    "displayName": null,
                    "field": "COUNTY",
                    "fieldFilterType": null,
                    "fieldType": "java.lang.String",
                    "resizable": true
                },
                {
                    "displayName": null,
                    "field": "TEACHING_LANGUAGE",
                    "fieldFilterType": null,
                    "fieldType": "java.lang.String",
                    "resizable": true
                },
                {
                    "displayName": null,
                    "field": "STATUT",
                    "fieldFilterType": null,
                    "fieldType": "java.lang.String",
                    "resizable": true
                },
                {
                    "displayName": null,
                    "field": "SCHOOL_TYPE",
                    "fieldFilterType": null,
                    "fieldType": "java.lang.String",
                    "resizable": true
                },
                {
                    "displayName": null,
                    "field": "STATUT",
                    "fieldFilterType": null,
                    "fieldType": "java.lang.String",
                    "resizable": true
                },
                {
                    "displayName": null,
                    "field": "ID_SCHOOL",
                    "fieldFilterType": null,
                    "fieldType": "java.lang.String",
                    "resizable": true
                },
                {
                    "displayName": null,
                    "field": "PROPERTY_FORM",
                    "fieldFilterType": null,
                    "fieldType": "java.lang.String",
                    "resizable": true
                }
            ],
            "primaryKey": {
                "columnName": "ID_SCHOOL",
                "visible": false
            }
        },
        "cols": [
            {
                "field": "ROW_NUM",
                "displayName": "schoolnetwork_grid.key.ROW_NUM.label",
                "fieldFilterType": "equals",
                "fieldType": null,
                "resizable": true
            },
            {
                "field": "ID_SCHOOL",
                "displayName": "schoolnetwork_grid.key.ID_SCHOOL.label",
                "fieldFilterType": "equals",
                "fieldType": null,
                "resizable": true
            },
            {
                "field": "CODE",
                "displayName": "schoolnetwork_grid.key.CODE.label",
                "fieldFilterType": "like",
                "fieldType": null,
                "resizable": true
            },
            {
                "field": "NAME",
                "displayName": "schoolnetwork_grid.key.NAME.label",
                "fieldFilterType": "like",
                "fieldType": null,
                "resizable": true
            },
            {
                "field": "SHORT_NAME",
                "displayName": "schoolnetwork_grid.key.SHORT_NAME.label",
                "fieldFilterType": "like",
                "fieldType": null,
                "resizable": true
            },
            {
                "field": "LOCALITY",
                "displayName": "schoolnetwork_grid.key.LOCALITY.label",
                "fieldFilterType": "like",
                "fieldType": null,
                "resizable": true
            },
            {
                "field": "PARENT_LOCALITY",
                "displayName": "schoolnetwork_grid.key.PARENT_LOCALITY.label",
                "fieldFilterType": "like",
                "fieldType": null,
                "resizable": true
            },
            {
                "field": "COUNTY",
                "displayName": "schoolnetwork_grid.key.COUNTY.label",
                "fieldFilterType": "like",
                "fieldType": null,
                "resizable": true
            },
            {
                "field": "STATUT",
                "displayName": "schoolnetwork_grid.key.STATUT.label",
                "fieldFilterType": "like",
                "fieldType": null,
                "resizable": true
            },
            {
                "field": "SCHOOL_TYPE",
                "displayName": "schoolnetwork_grid.key.SCHOOL_TYPE.label",
                "fieldFilterType": "like",
                "fieldType": null,
                "resizable": true
            },
            {
                "field": "PROPERTY_FORM",
                "displayName": "schoolnetwork_grid.key.PROPERTY_FORM.label",
                "fieldFilterType": "like",
                "fieldType": null,
                "resizable": true
            }
        ]
    }
}

You could then use the ID_SCHOOL from each entry to retrieve the details for that particular school. For example, the following is returned when you perform a GET request on https://www.siiir.edu.ro/carto/app/rest/school/details/11303404:

{
    "idSchool": 11303404,
    "internalIdSchool": 11052223,
    "idParentSchool": null,
    "schoolSocialLinks": [],
    "idSchoolYear": {
        "orderBy": 102,
        "isFutureYear": 0,
        "isCurrentYear": 1,
        "dateTo": 1598832000000,
        "dateFrom": 1567296000000,
        "description": "Anul şcolar 2019-2020",
        "code": "2019-2020",
        "idSchoolYear": 22
    },
    "schoolYearDescription": "Anul şcolar 2019-2020",
    "code": "3761105259",
    "siruesCode": "733633",
    "longName": "LICEUL TEORETIC \"MIHAI EMINESCU\", MUN. BÂRLAD",
    "shortName": "LIT MIHAI EMINESCU",
    "schoolType": "Unitate de învăţământ",
    "statut": "Cu personalitate juridică",
    "isPj": true,
    "fiscalCode": "4446562",
    "operatingMode": "Un schimb/zi",
    "propertyForm": "Publică de interes naţional şi local",
    "fundingForm": "Buget",
    "county": "VASLUI",
    "locality": "BÂRLAD",
    "street": "Eminescu Mihai",
    "streetNumber": " 1",
    "postalCode": "731199",
    "phoneNumber": "0235413003",
    "faxNumber": "0235413003",
    "email": "liceminescubarlad@yahoo.com",
    "schoolNumbers": {
        "idSchool": 11303404,
        "studyFormationsCount": 43,
        "studentsCount": 1181,
        "personnelCount": 119
    }
}
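
For anyone who wants to try this outside of a browser, here is a minimal Python sketch of the GET request described above. It is only an illustration, not the code anyone in this thread actually ran. The URL pattern and the key names (longName, county, locality, email) come from the response shown above; the function names, the choice of fields and the tab-delimited output are my own assumptions, and the endpoint may behave differently than it did when this was posted.

```python
import json
from urllib.request import urlopen

# URL pattern taken from the GET request shown above.
DETAILS_URL = "https://www.siiir.edu.ro/carto/app/rest/school/details/{id_school}"

# Keys as they appear in the details response above (the selection is an assumption).
FIELDS = ("longName", "county", "locality", "email")


def fetch_details(id_school, timeout=30):
    """GET the details JSON for one school and decode it into a dict."""
    url = DETAILS_URL.format(id_school=id_school)
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def to_tsv_line(details):
    """Build one tab-delimited line; missing or null values become ''."""
    return "\t".join(str(details.get(key) or "").strip() for key in FIELDS)


# Example (hits the live site, so it is left commented out):
# print(to_tsv_line(fetch_details(11303404)))
```

Calling to_tsv_line() once per ID_SCHOOL and writing each result to a file would give one line per school with name, county, locality and email separated by tabs.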

Finally, you may want to look into this GitHub repository, as it seems that others are using it to extract data from this same site.


@Danp2 That's it. As I mentioned in my earlier post above, I thought there would be a way to obtain the details without actually opening the web page. I knew you would know. Regretfully, I know nothing about WinHTTP and JSON. You gave me some difficult homework to do.

How did you find out the URL of the school details, https://www.siiir.edu.ro/carto/app/rest/school/details/11303404 ?

Edited by CYCho

10 minutes ago, CYCho said:

To let me start off, could you tell me what was the "appropriate POST" you used? 

This can also be found using the same technique I mentioned above. Each time you switch to a new "page" of the grid, the website makes a call to retrieve the new data to be loaded. Postman is a great tool for testing this kind of stuff.
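
The paging idea can be sketched roughly like this in Python. The POST URL and request body are not reproduced here, so the sketch does not guess them: fetch_page is a hypothetical callable you would write yourself after inspecting the request in the browser or Postman. The only thing taken from the grid data is the ID_SCHOOL field name; starting at page 1 and stopping at the first empty page are assumptions.

```python
def collect_school_ids(fetch_page, max_pages=1000):
    """Call fetch_page(1), fetch_page(2), ... until a page comes back empty.

    fetch_page is a hypothetical callable supplied by the caller: it should
    send the grid's POST request for the given page number and return that
    page's rows as a list of dicts, each with an "ID_SCHOOL" key.
    """
    ids = []
    for page in range(1, max_pages + 1):
        rows = fetch_page(page)
        if not rows:
            # Assumption: an empty page means there is nothing left to load.
            break
        ids.extend(row["ID_SCHOOL"] for row in rows)
    return ids
```

The max_pages cap is only a safety net so that a fetch_page that never returns an empty page cannot loop forever.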


  • 1 month later...

Thank you all,

 

Thank you very much. You GUYS ARE GOLD ❤️❤️❤️❤️❤️

I wish you guys had responded back when I needed this to copy 10.000 lines :))) Crazy job to do ...

Welp ... this is the kind of discussion I had hoped for back then too ... but instead they were sending me on a wild goose chase when I needed to finish it in like 3 days ... or something.

But now I will know what to do if I ever come across something like this.

 

If you guys want to test things out, here is another link you could play with: https://warclicks.com/country_stats. I will do my own research and post it here.

Thank you all again, YOU ARE GOLD ❤️❤️❤️❤️❤️

 
