Scrub HTML/JS from text entry
Question

This is a seemingly obvious question and if there is an obvious answer, I apologise.

To prevent HTML/JS injection when text is entered into a textarea, I want to run a regex against the entry. If a script or HTML-like content is found, I want to remove all of it, or at least prevent the text from being submitted.


I have started by checking for <script> and </script> tags using Regex_Search and Regex_Replace.


There are two input and two output parameters: Regex_Needle and Text_Haystack (both strings) are the inputs, and result (Boolean) and CleanedText (String) are the outputs.

The action checks whether a pattern has been passed in; if none has, it falls back to the default:

Default pattern:

<\/*?script\s*>\W[<\/*?script\s*>]*?

So RegexNeedle is the pattern and TextHaystack is the target. If the pattern is found, Regex_Replace should run against the target.


The two Assigns underneath are as follows:


False:

result: false

CleanedText : TextHaystack


True:

result: true

CleanedText:  Regex_Replace.Result.

So the returned 'object' should contain two attributes: a Boolean for the result, and the text. But nothing comes back.
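For anyone who cannot open the module, the intended flow can be sketched outside OutSystems. This is only a rough Python equivalent of the action described above, not the action itself: the function name `scrub` and the simplified `DEFAULT_PATTERN` are placeholders made up for the illustration, and the default pattern quoted earlier is deliberately not reused here.

```python
import re

# Hypothetical, simplified stand-in for the action's default pattern.
# It is NOT the pattern quoted in the post.
DEFAULT_PATTERN = r"</?script\s*>"


def scrub(text_haystack, regex_needle=""):
    """Return (result, cleaned_text), mirroring the two output parameters."""
    # Fall back to the default when no pattern has been passed in.
    pattern = regex_needle or DEFAULT_PATTERN
    if re.search(pattern, text_haystack, flags=re.IGNORECASE) is None:
        # False branch: nothing found, the text passes through unchanged.
        return False, text_haystack
    # True branch: the cleaned text is the output of the replace step.
    return True, re.sub(pattern, "", text_haystack, flags=re.IGNORECASE)
```

With this sketch, `scrub("hello")` gives `(False, "hello")` and `scrub("a<script>b</script>c")` gives `(True, "abc")`, which is the shape of result I expect from the action.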

I have tested this pattern again and again, and it should return false on the first found pattern, but it's not stripping the text out even when it's found.


Can anyone see what I am doing incorrectly?



All help is very much appreciated.


Thanks,


Jim




2020-09-15 13-07-23
Kilian Hekhuis
 
MVP

Hi Jim,

It would help if you could include an eSpace with said code; it'll be much easier to see what's going wrong.

2019-10-01 17-16-03
Jim Crawford

I would love to provide the eSpace, however most of the code is proprietary to my company and I would be fired in a second if I shared it... that's why I'm trying to limit what I show to the function itself. When debugging I see the data flows correctly; it just doesn't seem to like the pattern. I'm just going to push through, I suppose :)


2020-09-15 13-07-23
Kilian Hekhuis
 
MVP

It would be fairly easy to just isolate the one Action you showed above?

2019-10-01 17-16-03
Jim Crawford

Kilian Hekhuis wrote:

It would be fairly easy to just isolate the one Action you showed above?


I just don't know enough about OutSystems to export a single action. I've tried looking in the 'save as' but all I can find is clone or export the full thing :(


2020-09-15 13-07-23
Kilian Hekhuis
 
MVP

You can create a new eSpace, then copy/paste the action over.

2019-10-01 17-16-03
Jim Crawford

Try this; it's in Server Actions / OS 11.

examples.oml
2020-09-15 13-07-23
Kilian Hekhuis
 
MVP

Thanks. There's nothing obviously wrong with the way it's handled, though the regular expression itself is not very sound. For example:

<script src="https://malware.com/nastyscript.js">

is a perfectly valid script tag, and won't be detected since your regex only allows whitespace after "script". Also, [] define a character class (a set or range of single characters), but you have the entire closing tag in there. Thirdly, with this setup (after correcting the []), if there are multiple occurrences, everything between the first opening tag and the last closing tag will be removed, not just what's in between each pair of tags.
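These three points can be reproduced outside OutSystems. Below is a minimal Python sketch, assuming the engine behind Regex_Search treats these constructs the same way Python's `re` module does. `ORIGINAL` is the default pattern from the question; `GREEDY` and `LAZY` are patterns made up purely to illustrate the third point and are not a recommended fix.

```python
import re

# The default pattern from the question, copied verbatim.
ORIGINAL = r"<\/*?script\s*>\W[<\/*?script\s*>]*?"

# 1. A script tag with attributes is not matched at all.
tag = '<script src="https://malware.com/nastyscript.js">'
print(re.search(ORIGINAL, tag))  # None

# 2. Square brackets match ONE character out of the listed set,
#    not the sequence of characters written between them.
CHAR_CLASS = r"[<\/*?script\s*>]"
print(bool(re.fullmatch(CHAR_CLASS, "s")))          # True
print(bool(re.fullmatch(CHAR_CLASS, "</script>")))  # False

# 3. With two script blocks, a greedy ".*" runs from the first opening
#    tag to the LAST closing tag, taking the text in between with it.
text = "keep1 <script>a()</script> keep2 <script>b()</script> keep3"
GREEDY = r"<script\b[^>]*>.*</script\s*>"  # illustrative only
LAZY = r"<script\b[^>]*>.*?</script\s*>"   # illustrative only
flags = re.IGNORECASE | re.DOTALL
print(re.sub(GREEDY, "", text, flags=flags))  # keep1  keep3
print(re.sub(LAZY, "", text, flags=flags))    # keep1  keep2  keep3
```

Note that the greedy version silently drops "keep2", which is legitimate user text.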

So I'd advise you to use Google, which will lead you to answers like this or this, both of which will tell you you don't want to use regex on HTML.
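To give an idea of what "not regex" can look like, here is a hedged Python sketch of two common alternatives: rejecting input that contains any tag (using a real HTML parser to detect it) and escaping the text so the browser renders it as plain characters. It uses only Python's standard `html` and `html.parser` modules; the names `TagDetector`, `contains_markup` and `neutralise` are invented for this example, and how you would do the equivalent inside OutSystems is not covered here.

```python
from html import escape
from html.parser import HTMLParser


class TagDetector(HTMLParser):
    """Records whether the parser saw any start or end tag."""

    def __init__(self):
        super().__init__()
        self.found_tag = False

    def handle_starttag(self, tag, attrs):
        self.found_tag = True

    def handle_endtag(self, tag):
        self.found_tag = True


def contains_markup(text):
    """True if the text contains something the parser reads as a tag."""
    detector = TagDetector()
    detector.feed(text)
    detector.close()
    return detector.found_tag


def neutralise(text):
    """Escape &, <, > and quotes so the text is displayed, not interpreted."""
    return escape(text, quote=True)
```

For example, `contains_markup('<script src="https://malware.com/nastyscript.js">')` is `True`, so the submission could be refused, while `neutralise("<script>alert(1)</script>")` returns `&lt;script&gt;alert(1)&lt;/script&gt;`, which is harmless when rendered.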

Lastly, I don't understand why you assign the integer 1 to cleanedText in the Found = True branch.


2019-10-01 17-16-03
Jim Crawford

Thanks,


I will do that. The integer 1 was a mistake made while I was copying the action over; I only noticed once I had exported it! It was supposed to be Regex_search.result!
