the123punch

Extracting Data from a File

3 posts in this topic

Hi Guys,

Basically I have a file from which I need to extract some specific data.

In this case, I have the following data in my file:

CODE
Sequence databases

EMBLAF006084; AAB64189.1; -; mRNA.[EMBL / GenBank / DDBJ] [CoDingSequence]

AC004922; -; NOT_ANNOTATED_CDS; Genomic_DNA.[EMBL / GenBank / DDBJ]

BC002562; AAH02562.1; -; mRNA.[EMBL / GenBank / DDBJ] [CoDingSequence]

BC002988; AAH02988.2; -; mRNA.[EMBL / GenBank / DDBJ] [CoDingSequence]

BC007555; AAH07555.1; -; mRNA.[EMBL / GenBank / DDBJ] [CoDingSequence]

What I need to do is extract only the first field from each line. To be clearer: the second line starts with EMBLAF006084, and I need only the part before the first semicolon. For that line only, I also don't need the EMBL prefix.

The other lines work the same way: I just need the first entry, the one before the first semicolon.

My file is bigger than this snippet, but I didn't want to paste the whole thing.

If there is a way to check each line and take only the text before the first semicolon, I think that would solve my issue.

Can anyone help?

Thanks.

Read the file one line at a time with FileReadLine.

Split each line with StringSplit, using ";" as the delimiter.

StringSplit returns the element count in $array[0], so $array[1] holds the first field, which is the string you want.

Do this for every line.

:whistle:
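
A minimal sketch of that loop, in case it helps. The file name `sequences.txt` is a made-up placeholder, and treating a leading "EMBL" as a prefix to strip is an assumption based on the first data line in the question, so adjust both to fit the real file.

```autoit
; Hypothetical input path; replace with the real file.
Local $sPath = @ScriptDir & "\sequences.txt"

Local $hFile = FileOpen($sPath, 0) ; 0 = open for reading
If $hFile = -1 Then
    MsgBox(16, "Error", "Could not open " & $sPath)
    Exit 1
EndIf

While 1
    Local $sLine = FileReadLine($hFile)
    If @error Then ExitLoop ; end of file reached (or a read error)

    ; Skip lines with no semicolon, e.g. the "Sequence databases" header and blank lines.
    If Not StringInStr($sLine, ";") Then ContinueLoop

    ; $aParts[0] is the number of pieces, $aParts[1] is the text before the first ";".
    Local $aParts = StringSplit($sLine, ";")
    Local $sId = StringStripWS($aParts[1], 3) ; 3 = strip leading and trailing whitespace

    ; Assumption: a leading "EMBL" is not part of the ID and should be dropped.
    If StringLeft($sId, 4) = "EMBL" Then $sId = StringTrimLeft($sId, 4)

    ConsoleWrite($sId & @CRLF)
WEnd

FileClose($hFile)
```

ConsoleWrite only shows output in an editor console such as SciTE; to keep the IDs, they could be appended to an array or written to another file with FileWriteLine instead.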



#3 ·  Posted (edited)

I'd start by doing some homework on the string functions. I'd use StringRegExp() myself.

Something like:

#include <Array.au3> ; needed for _ArrayDisplay()

;$sString = FileRead('File.Location.Name')
$sString = 'Sequence databases' & @CRLF & _
    'EMBLAF006084; AAB64189.1; -; mRNA.[EMBL / GenBank / DDBJ] [CoDingSequence]' & @CRLF & _
    'AC004922; -; NOT_ANNOTATED_CDS; Genomic_DNA.[EMBL / GenBank / DDBJ]' & @CRLF & _
    'BC002562; AAH02562.1; -; mRNA.[EMBL / GenBank / DDBJ] [CoDingSequence]' & @CRLF & _
    'BC002988; AAH02988.2; -; mRNA.[EMBL / GenBank / DDBJ] [CoDingSequence]' & @CRLF & _
    'BC007555; AAH07555.1; -; mRNA.[EMBL / GenBank / DDBJ] [CoDingSequence]'
; (?m) lets ^ match at the start of every line, so a data line at the very top of the file is not missed.
; (?:EMBL)? skips the unwanted EMBL prefix; flag 3 returns every captured ID in one array.
$aArray = StringRegExp($sString, '(?im)^(?:EMBL)?([a-z0-9]+);', 3)
_ArrayDisplay($aArray)
Edited by SmOke_N

