StringRegExpReplace and HTML-Tags


I am using this page http://www.cpu-world.com/CPUs/Core_2/Intel...0562Q6600).html

I am having problems with this StringRegExp expression.

StringRegExpReplace($HTML, '(?i)(<a href=\".*\"\>)', '')

I want to remove all the links but leave the text. It works for the first couple, but then I get two or more links in one array element:

0 => <A href="http://www.cpu-world.com/CPUs/CPU.html">

1 => <A href="http://www.cpu-world.com/CPUs/Core_2/TYPE-Core%202%20Quad.html">

2 => <A href="http://www.cpu-world.com/sspec/SL/SL9UM.html">SL9UM</A> &nbsp; <A href="http://www.cpu-world.com/sspec/SL/SLACR.html">

3 => <A href="http://www.cpu-world.com/sspec/QM/QMAQ.html">QMAQ</A> &nbsp; <A href="http://www.cpu-world.com/sspec/QU/QUPT.html">QUPT</A> &nbsp; <A href="http://www.cpu-world.com/sspec/QX/QXVD.html">QXVD</A> &nbsp; <A href="http://www.cpu-world.com/sspec/SL/SL9UM.html">SL9UM</A> &nbsp; <A href="http://www.cpu-world.com/sspec/SL/SLACR.html">

4 => <A href="http://www.cpu-world.com/Sockets/Socket%20775%20(LGA775).html">

I have looked on the forum :D

This is the output I am trying to get:

0 => <A href="http://www.cpu-world.com/CPUs/CPU.html">

1 => <A href="http://www.cpu-world.com/CPUs/Core_2/TYPE-Core%202%20Quad.html">

2 => <A href="http://www.cpu-world.com/sspec/SL/SL9UM.html">

3 => <A href="http://www.cpu-world.com/sspec/SL/SLACR.html">

Thanks,

Keith

Source

<table width="65%"><tr><td bgcolor="#FFD060"><h1>Intel Core 2 Quad Q6600 HH80562PH0568M (BX80562Q6600)</h1></td><tr><td><!--

*** Intel Core 2 Quad Q6600 HH80562PH0568M (BX80562Q6600) INFORMATION ***

-->

<TABLE class=dh_table cellSpacing=0 cellPadding=0 width="100%" border=0>

<TBODY>

<TR bgColor=#c0c0c0>

<TD vAlign=top align=middle colSpan=2><B>General information</B></TD></TR>

<TR>

<TD vAlign=top>Type</TD>

<TD>CPU / Microprocessor</A></TD></TR>

<TR bgColor=#f0f0f0>

<TD vAlign=top>Family</TD>

<TD>Intel Core 2 Quad</A></TD></TR>

<TR>

<TD vAlign=top>Model number

<SPAN class="_link cw_help _MODELN"><A href="http://www.cpu-world.com/Glossary/P/Processor_Model_number.html" target=_blank>&nbsp;?&nbsp;</A></SPAN></TD>

<TD>Q6600</TD></TR>

<TR bgColor=#f0f0f0>

<TD vAlign=top>Part number</TD>

<TD>HH80562PH0568M</TD></TR>

<TR>

<TD vAlign=top>BX80562Q6600 s-specs</TD>

<TD>SLACR</A></TD></TR>

<TR bgColor=#f0f0f0>

<TD vAlign=top>HH80562PH0568M s-specs</TD>

<TD>SLACR</A></TD></TR>

<TR>

<TD vAlign=top><B>Frequency (MHz)</B></TD>

<TD><B>2400</B></TD></TR>

<TR bgColor=#f0f0f0>

<TD vAlign=top><B>Bus speed (MHz)</B> <SPAN class="_link cw_help _FSB"><A href="http://www.cpu-world.com/Glossary/F/Front_Side_Bus_(FSB).html" target=_blank>&nbsp;?&nbsp;</A></SPAN></TD>

<TD><B>1066</B></TD></TR>

<TR>

<TD vAlign=top><B>Clock multiplier</B> <SPAN class="_link cw_help _CLOCK_MULT"><A href="http://www.cpu-world.com/Glossary/B/Bus_clock_multiplier.html" target=_blank>&nbsp;?&nbsp;</A></SPAN></TD>

<TD><B>9</B></TD></TR>

<TR bgColor=#f0f0f0>

<TD vAlign=top>Package</TD>

<TD>775-land Flip-Chip Land Grid Array (FC-LGA6)<BR>1.48" x 1.48" (3.75 cm x 3.75 cm)</TD></TR>

<TR>

<TD vAlign=top><B>Socket</B></TD>

<TD><B>Socket 775 (LGA775)</A></B></TD></TR>

<TR bgColor=#f0f0f0>

<TD vAlign=top>Introduction date</TD>

<TD>Jan 8, 2007</TD></TR>

<TR>

<TD vAlign=top>Price at introduction</TD>

<TD>$851</TD></TR>

<TR>

<TD colSpan=2>&nbsp;</TD></TR>

<TR bgColor=#c0c0c0>

<TD vAlign=top align=middle colSpan=2><B>Architecture / Microarchitecture</B></TD></TR>

<TR>

<TD vAlign=top>Manufacturing process</TD>

<TD>0.065 micron</TD></TR>

<TR bgColor=#f0f0f0>

<TD vAlign=top><B>Data width</B></TD>

<TD><B>64 bit</B></TD></TR>

<TR>

<TD vAlign=top><B>Number of cores</B></TD>

<TD><B>4</B></TD></TR>

<TR bgColor=#f0f0f0>

<TD vAlign=top>Floating Point Unit</TD>

<TD>Integrated</TD></TR>

<TR>

<TD vAlign=top>Level 1 cache size <SPAN class="_link cw_help _L1"><A href="http://www.cpu-world.com/Glossary/L/Level_1_cache.html" target=_blank>&nbsp;?&nbsp;</A></SPAN></TD>

<TD>4 x 32 KB instruction caches<BR>4 x 32 KB data caches</TD></TR>

<TR bgColor=#f0f0f0>

<TD vAlign=top><B>Level 2 cache size</B> <SPAN class="_link cw_help _L2"><A href="http://www.cpu-world.com/Glossary/L/Level_2_cache.html" target=_blank>&nbsp;?&nbsp;</A></SPAN></TD>

<TD><B>2 x 4 MB L2 caches (each L2 cache is shared between 2 cores)</B></TD></TR>

<TR>

<TD vAlign=top>Features</TD>

<TD>

<UL>

<LI>MMX instruction set

<LI>SSE

<LI>SSE2

<LI>SSE3

<LI>Supplemental SSE3

<LI>EM64T technology

<LI>Virtualization Technology

<LI>Execute Disable Bit technology </LI></UL></TD></TR>

<TR bgColor=#f0f0f0>

<TD vAlign=top>Low power features</TD>

<TD>

<UL>

<LI>Enhanced SpeedStep technology <SPAN class="_link cw_help _ENH_SSTEP"><A href="http://www.cpu-world.com/Glossary/E/Enhanced_SpeedStep_technology.html" target=_blank>&nbsp;?&nbsp;</A></SPAN>

<LI>Stop Grant state <SPAN class="_link cw_help _STOP_GRANT_MODE"><A href="http://www.cpu-world.com/Glossary/S/Stop_Grant_state.html" target=_blank>&nbsp;?&nbsp;</A></SPAN>

<LI>Halt or Extended Halt state </LI></UL></TD></TR>

<TR>

<TD colSpan=2>&nbsp;</TD></TR>

<TR bgColor=#c0c0c0>

<TD vAlign=top align=middle colSpan=2><B>Electrical/Thermal parameters</B></TD></TR>

<TR>

<TD vAlign=top>V core (V) <SPAN class="_link cw_help _VCORE"><A href="http://www.cpu-world.com/Glossary/C/Core_voltage.html" target=_blank>&nbsp;?&nbsp;</A></SPAN></TD>

<TD>0.85 - 1.5</TD></TR>

<TR bgColor=#f0f0f0>

<TD vAlign=top>Max operating temperature (°C) <SPAN class="_link cw_help _MIN_MAX_TEMP"><A href="http://www.cpu-world.com/Glossary/M/Minimum_Maximum_operating_temperatures.html" target=_blank>&nbsp;?&nbsp;</A></SPAN></TD>

<TD>60.3</TD></TR>

<TR>

<TD vAlign=top>Max power dissipation (W) <SPAN class="_link cw_help _MIN_MAX_POWER"><A href="http://www.cpu-world.com/Glossary/M/Minimum_Maximum_power_dissipation.html" target=_blank>&nbsp;?&nbsp;</A></SPAN></TD>

<TD>155.25</TD></TR>

<TR bgColor=#f0f0f0>

<TD vAlign=top><B>Thermal Design Power (W)</B> <SPAN class="_link cw_help _TDP"><A href="http://www.cpu-world.com/Glossary/T/Thermal_Design_Power_(TDP).html" target=_blank>&nbsp;?&nbsp;</A></SPAN></TD>

<TD><B>105</B></TD></TR>

<TR>

<TD colSpan=2>&nbsp;</TD></TR>

<TR bgColor=#c0c0c0>

<TD align=middle colSpan=2><B>Notes on Intel HH80562PH0568M</B></TD></TR>

<TR bgColor=#ffffff>

<TD colSpan=2>

<UL>

<LI>Binary compatible with 32-bit x86 software

<LI>Bus frequency is 266 MHz. Because the processor uses Quad Data Rate bus the effective bus speed is 1066 MHz

<LI>Part HH80562PH0568M is an OEM processor

<LI>Part BX80562Q6600 is a boxed (retail) processor</LI></UL></TD></TR></TBODY></TABLE></td></tr></table>


Hi,

Maybe something like this:

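#include <Array.au3>

; Read the saved page source (Source.txt is looked up in the working directory)
$HTML = FileRead("Source.txt")

; (?i) ignores case, (?s) lets the dot match newlines, and the lazy .*? stops at the first closing quote.
; Flag 3 returns an array of every captured group, here the URL of each link.
$aHTML = StringRegExp($HTML, '(?i)(?s)<a href="(.*?)"', 3)

; Rebuild the opening tag around each captured URL
For $i = 0 To UBound($aHTML)-1
    $aHTML[$i] = StringFormat('<A href="%s">', $aHTML[$i])
Next

_ArrayDisplay($aHTML)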

?
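
A note on why the original pattern put several links into one element: `.*` is greedy, so on a line with more than one link it runs to the last `">` it can find instead of the first. The lazy form `.*?` stops at the first one. Below is a minimal sketch of the difference, plus one possible way to do what was actually asked (remove the link tags but keep the text) with StringRegExpReplace. The sample string and the variable names are made up for illustration, modelled on the rows quoted in the question; this is not tested against the live page.

```autoit
; Hypothetical sample input, modelled on the rows quoted in the question
Local $sHTML = '<A href="http://www.cpu-world.com/sspec/SL/SL9UM.html">SL9UM</A> &nbsp; ' & _
        '<A href="http://www.cpu-world.com/sspec/SL/SLACR.html">SLACR</A>'

; Greedy: .* runs to the LAST "> on the line, so both links end up in a single match
Local $aGreedy = StringRegExp($sHTML, '(?i)<a href=".*">', 3)

; Lazy: .*? stops at the FIRST closing quote, so each link is its own match
Local $aLazy = StringRegExp($sHTML, '(?i)<a href=".*?">', 3)

ConsoleWrite("Greedy matches: " & UBound($aGreedy) & @CRLF)
ConsoleWrite("Lazy matches:   " & UBound($aLazy) & @CRLF)

; Strip opening and closing anchor tags but leave the link text in place.
; [^>]* cannot run past the end of the tag, so greediness is not a problem here.
Local $sText = StringRegExpReplace($sHTML, '(?i)</?a\b[^>]*>', '')
ConsoleWrite($sText & @CRLF)
```

On this sample the greedy pattern should give one match and the lazy pattern two. The replace pattern assumes no `>` character appears inside an attribute value, which holds for the links shown in this thread but is not guaranteed for arbitrary HTML.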

 


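#include <IE.au3>
#include <Array.au3>

$sURL = "http://www.cpu-world.com/CPUs/Core_2/Intel-Core%202%20Quad%20Q6600%20HH80562PH0568M%20(BX80562Q6600).html"
; Second parameter 1 = try to attach to an existing browser window first
$oIE = _IECreate($sURL, 1)
; Let the browser parse the page and hand back every link object
$oLinks = _IELinkGetCollection($oIE)
; _IELinkGetCollection returns the number of links in @extended
$iNumLinks = @extended
Dim $aLinks[$iNumLinks]
For $i = 0 To $iNumLinks - 1
    ; outerHTML is the full tag including its text
    $aLinks[$i] = $oLinks.item($i).outerHTML
Next
_ArrayDisplay($aLinks)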
