Best Way To Check Duplicates Within Text
Question
Application Type
Reactive

Hi, I am wondering if there is a good method to check for duplicates within text.

The user will upload an Excel file, and I need to check for duplicate 2-character chunks within a record.

Example: in column B the user has entered the record ARAWARA1BH

I will need to break it down into

AR, AW, AR, A1, BH, and if there is a duplicate I will print in column C: Duplicate found AR

As the value captured in column B does not contain any delimiters, I am not able to split it with a string function.

The Excel file can contain up to a few thousand records, so I need an efficient method to find the duplicates quickly and avoid a timeout.

Thank you in advance for the help

Kay Lun

Hi Jerah,

The first thing that comes to mind is using an If to build a conditional loop, then calling Substr() on column B and appending each result to a Text list.

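For example, starting with currentIndex = 0:

If(currentIndex < Length(columnB)) -> Substr(columnB, currentIndex, 2) -> ListAppend(StringList, Substr result) -> currentIndex = currentIndex + 2 -> back to the If

The number 2 is the chunk length, that is, how many characters each split should contain. Because currentIndex advances by 2 on every pass, the loop condition compares it against the full Length(columnB), not half of it.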
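
To make the splitting step concrete, here is a minimal sketch of the same loop in Python rather than OutSystems logic. The function name `split_into_pairs` and the `size` parameter are made up for illustration; the slicing plays the role of Substr() and the list plays the role of StringList.

```python
def split_into_pairs(text: str, size: int = 2) -> list[str]:
    """Split text into consecutive chunks of `size` characters.

    Mirrors the loop above: start at index 0, take `size` characters,
    then advance the index by `size` until the end of the text.
    A trailing chunk may be shorter if len(text) is not a multiple of size.
    """
    return [text[i:i + size] for i in range(0, len(text), size)]


print(split_into_pairs("ARAWARA1BH"))  # ['AR', 'AW', 'AR', 'A1', 'BH']
```

Each record is scanned once, so a few thousand short records should be cheap to split this way.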


After that, use the ListDistinct function to detect duplicated values.

Remember that the ListDistinct function is not available by default. You need to go to the dependencies, find (System), and search with the keyword List. You should then see the ListDistinct function there; tick the checkbox to start using it :)

Hope this helps.

2021-06-13 07-48-30
Jerah

Thank you for your help, I will go and try out your suggestion :)

Kay Lun

I'm sure you will get what you need :)

Cheers.

2021-06-13 07-48-30
Jerah

I got to this point. For the ListDistinct function, does it mean that if there is a duplicate it will be reflected in Current?

2021-06-13 07-48-30
Jerah

I repeated the process with no duplicates, but it still shows in Current.

Kay Lun

According to the description of the function, it should return a list with all duplicated values removed. That means that whenever a duplicate exists, the distinct list will be shorter than the original list that contains all the splits.

So you could compare the lengths of the two lists: if they are not the same, then columnB contains duplicated value(s).

I know it sounds like too much effort just to check for duplicates, but this is what I came up with first, so there might be another solution that reduces the steps haha.

Hope this helps :)
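
As a rough illustration of the length comparison (in Python, not OutSystems), the sketch below uses a `set` as a stand-in for the result of ListDistinct. The name `has_duplicates` is hypothetical.

```python
def has_duplicates(chunks: list[str]) -> bool:
    """Return True if the distinct values are fewer than the original items."""
    distinct = set(chunks)  # stand-in for the distinct list
    return len(distinct) != len(chunks)


print(has_duplicates(["AR", "AW", "AR", "A1", "BH"]))  # True
print(has_duplicates(["AR", "AW", "A1", "BH"]))        # False
```

This only says whether a duplicate exists, not which value is duplicated.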

Cheers.

Kay Lun

Reading your first post again, I forgot that you need to find the duplicated value. As I mentioned in the last comment, you have two lists on hand:

  1.  the original list containing all the splits
  2.  the distinct list

So you first compare their lengths to see whether there is any duplicated value. If there is, loop through the distinct list and use ListIndexOf and ListRemove to remove one occurrence of each value from list 1. Whatever remains in list 1 is the duplicated value(s).
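
The removal idea can be sketched in Python as follows. This is only an illustration under assumptions: `find_duplicates` and `column_c_message` are hypothetical names, `list.remove` stands in for ListIndexOf plus ListRemove, and the message format is taken from the "Duplicate found AR" example in the question.

```python
def find_duplicates(chunks: list[str]) -> list[str]:
    """Remove one occurrence of each distinct value; what is left are duplicates."""
    distinct = list(dict.fromkeys(chunks))  # distinct values, original order kept
    remaining = list(chunks)                # work on a copy of the original list
    for value in distinct:
        remaining.remove(value)             # removes the first occurrence only
    # A value repeated three or more times is still left several times,
    # so collapse the leftovers to report each duplicate once.
    return list(dict.fromkeys(remaining))


def column_c_message(text: str, size: int = 2) -> str:
    """Build the column C text for one record, or an empty string if clean."""
    chunks = [text[i:i + size] for i in range(0, len(text), size)]
    duplicates = find_duplicates(chunks)
    return ("Duplicate found " + ", ".join(duplicates)) if duplicates else ""


print(column_c_message("ARAWARA1BH"))  # Duplicate found AR
```

Since each record here has only a handful of chunks, the repeated list removals are unlikely to matter for a few thousand rows; for much longer records, counting occurrences in a single pass would be the usual alternative.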

