Searching in textfile and extracting text


Hi,

What's the best approach in OS to process text files and search for text in them? I have to process mail messages from which I want to get the content of a certain tag (and insert that content into an entity):

<span class="telno">1234567890</span>

Regards, Harry

Solution

Hi Harry,

You can of course use the built-in Index function to search for a specific text. If you want slightly more control over what to search for (e.g. with wildcards), take a look at Regex_Search from the Text extension.
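To illustrate the regex idea outside of OutSystems, here is a minimal Python sketch. It is not the Regex_Search API; the sample message, the pattern, and the helper name `extract_telno` are assumptions made up for this example, based on the tag shown in the question.

```python
import re

# Hypothetical sample message body; the tag and class come from the question.
message = 'Call me: <span class="telno">1234567890</span> Thanks'

# Non-greedy capture of everything between the opening and closing tag.
TELNO_PATTERN = re.compile(r'<span class="telno">(.*?)</span>', re.DOTALL)


def extract_telno(text):
    """Return the content of the first telno span, or None if there is none."""
    match = TELNO_PATTERN.search(text)
    return match.group(1) if match else None


print(extract_telno(message))
```

The non-greedy `(.*?)` matters here: a greedy `(.*)` would run on to the last `</span>` in the message if there were several.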

Solution

Hi Harry,

You can use the FileSystem component to read the text from the file, convert it to binary data, and then perform the Regex_Search:

https://www.outsystems.com/forge/component/68/filesystem/

Sachin

Hi Sachin,

Regex_Search works on Text, not Binary data!

Hi Harry,

If your file contains more than one instance (a variable number), then you need to split the text file on the newline delimiter (depending on what the delimiter is) and then run a For Each loop over the resulting list. Inside the loop, search the text via Regex_Search and then insert the value into the database.

Again, that depends on what your text file looks like and how many instances of the text to search for there will be.

Overall, the main function Regex_Search will do the work :)

Hi Killian, thanks for correcting me. We have to read the text from the file, and then we can perform the Regex_Search on the text returned by the function.

Sachin


Debasis Sahoo wrote:

If your file has got more than one instances (variable) then you need to split the text file with new line delimiter [...]

I'm not sure why you think that. When using Index, you can adjust the start, and when using Regex_Search, you can use Substr to search the remainder. No need to split it, especially because you could also have more than one per line.
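The "adjust the start" approach can be sketched in Python as follows. This is only an illustration of the technique, not OutSystems code: `str.find` stands in for Index, and the function name `find_all_contents` and the sample text are hypothetical.

```python
def find_all_contents(text, open_tag, close_tag):
    """Collect the content of every open_tag...close_tag pair in text."""
    results = []
    start = 0
    while True:
        # Search from the current offset instead of splitting the text.
        open_pos = text.find(open_tag, start)
        if open_pos == -1:
            break
        content_start = open_pos + len(open_tag)
        close_pos = text.find(close_tag, content_start)
        if close_pos == -1:
            break  # Unclosed tag: stop rather than guess where it ends.
        results.append(text[content_start:close_pos])
        # Continue searching right after the closing tag.
        start = close_pos + len(close_tag)
    return results


# Hypothetical sample: two tags on one line, one on the next.
sample = (
    '<span class="telno">111</span> or <span class="telno">222</span>\n'
    'fax: <span class="telno">333</span>'
)
print(find_all_contents(sample, '<span class="telno">', '</span>'))
```

Because the search simply resumes after each match, it does not matter whether the occurrences share a line or are spread over several.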

You are correct, Killian. I was focused on the case of one occurrence per line, but you are right that it could appear more than once per line.

Your method is the better way of achieving it.

Cheers :)

Hi,

OK, thank you all. I am now trying this (searching for the first position) using

 Index(File_ReadText.Text,"<span class="tel">") 

but because of the double quotes inside the text to search for, it does not work. I tried putting single quotes around it, but alas... Any idea?

Harry   

 

Hi Harry,

Try doubling each double quote inside the text:

Index(File_ReadText.Text,"<span class=""tel"">") 

Cheers,

José

José,


That worked. Thank you.

 

Regards, Harry