Byte size of text
Question

Is there any way to count the byte size of a text without using the Binary API?

2020-09-21 08-42-47
Vincent Koning
Solution

@tuyenhx 

You could create a component via Integration Studio that tells you this. The following C# code should do the trick:

// Requires: using System.Text;
// Calculate the byte size of a string in UTF-16
int byteSize = Encoding.Unicode.GetByteCount(text);

Ron Ben

Can I do it in Service Studio as well? 

2020-09-21 08-42-47
Vincent Koning

As far as I know, you need to create this component yourself via Integration Studio, or check the Forge to see whether someone has already published one that you can consume. There are no native actions in OutSystems that can do this.

2021-06-03 11-03-21
Anubhav Rai

Hi @tuyenhx ,
You can check this Link.
Here you can get the file size with the help of JS.

Regards,
Anubhav

2026-01-03 13-44-38
Erwin van Rijsewijk
Champion

To get the "size" of a text, you can use Length()

2020-09-21 08-42-47
Vincent Koning

This is not true. The byte size of a string/text is possibly different than its length.

As quoted from Stack Overflow:

Nope. A zero terminated string has one extra byte. A pascal string (the Delphi shortstring) has an extra byte for the length. And unicode strings has more than one byte per character.

By unicode it depends on the encoding. It could be 2 or 4 bytes per character or even a mix of 1,2 and 4 bytes.

Since I can store Unicode strings in a Text, I presume that we use 2 bytes per character, but I could not find any documentation about the real-life implementation of strings in OutSystems. So other than using the TextToBinary action, I do not know how to get the byte size. Even then, I am not sure this will truly represent the actual byte size, because I'm not sure whether the extra metadata from the string is removed by this action.

@tuyenhx : Why do you need this?


2020-09-15 13-07-23
Kilian Hekhuis
 
MVP

Both C# and JavaScript use UTF-16. So it's 2 bytes per character for anything but the most outlandish ones (don't know about emojis, those could be 4).

2020-09-21 08-42-47
Vincent Koning

I found this about UTF-16:

In UTF-16, a single character takes either 2 bytes or 4 bytes, depending on whether the character falls within the Basic Multilingual Plane (BMP) or outside the BMP, respectively. Characters within the BMP, such as most commonly used alphabets and symbols, are represented using 2 bytes. Characters outside the BMP, such as emoji and certain uncommon characters, are represented using 4 bytes. 


All in all, the Length of a string will surely not tell the byte size of a string.
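To make the difference concrete, here is a minimal JavaScript sketch (the helper name `describeSizes` is made up for this illustration). It relies on the fact that JavaScript strings are sequences of UTF-16 code units, so `length` counts code units rather than visible characters. How OutSystems' own Length() counts is not documented in this thread, so treat this only as an illustration of the UTF-16 behaviour described above.

```javascript
// Hypothetical helper: compare three different "sizes" of the same string.
function describeSizes(text) {
  return {
    // String.length counts UTF-16 code units, not user-visible characters.
    codeUnits: text.length,
    // Array.from iterates by Unicode code point, so a surrogate pair counts once.
    codePoints: Array.from(text).length,
    // Every UTF-16 code unit occupies 2 bytes (no BOM or terminator counted).
    utf16Bytes: text.length * 2,
  };
}

console.log(describeSizes("abc"));    // { codeUnits: 3, codePoints: 3, utf16Bytes: 6 }
console.log(describeSizes("日本語")); // { codeUnits: 3, codePoints: 3, utf16Bytes: 6 }
console.log(describeSizes("😀"));     // { codeUnits: 2, codePoints: 1, utf16Bytes: 4 }
```

The emoji is a single code point outside the BMP, so it needs two code units (4 bytes), while the Japanese characters used here are inside the BMP and take 2 bytes each.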

tuyenhx

I am rebuilding an application that is required to count the bytes of Japanese text, so I'm trying to do it without using an API, because getting permission to use an API is quite hard.

2020-09-21 08-42-47
Vincent Koning

Would my suggestion below (C# code) be a solution for you?

Ron Ben

Vincent Koning,

Would you recommend just doing it in JavaScript?

2020-09-21 08-42-47
Vincent Koning

Doesn't really matter. You can see how to do that here: https://stackoverflow.com/questions/2219526/how-many-bytes-in-a-javascript-string

But please also read the article @Kilian Hekhuis posted yesterday. It is a great read and shows exactly the troubles of string length in UTF. Since I have no idea how to reference a post in a thread on this forum, I put the link here: https://tonsky.me/blog/unicode/
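For the JavaScript route, a minimal sketch could look like the one below. The function names `utf8ByteSize` and `utf16ByteSize` are made up for this example. It assumes an environment where the standard `TextEncoder` is available (modern browsers and current Node.js), and note that `TextEncoder` only produces UTF-8. Which encoding the original application counted in is not stated in this thread; if it was a legacy Japanese encoding such as Shift_JIS, neither number below would match.

```javascript
// Hypothetical helper: byte size of a string when encoded as UTF-8.
function utf8ByteSize(text) {
  // TextEncoder.encode returns a Uint8Array holding the UTF-8 bytes.
  return new TextEncoder().encode(text).length;
}

// Hypothetical helper: byte size of a string when encoded as UTF-16.
function utf16ByteSize(text) {
  // JavaScript strings are UTF-16 code units, each 2 bytes wide.
  return text.length * 2;
}

const sample = "日本語";
console.log(utf8ByteSize(sample));  // 9 (3 bytes per character here)
console.log(utf16ByteSize(sample)); // 6 (2 bytes per character here)
```

The same text gives different byte sizes depending on the encoding, so it is worth confirming which encoding the requirement refers to before picking one of these.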


2018-03-19 17-02-52
Peter Hendrix

Vincent Koning is right. 

This post shows the difference between software engineers and citizen developers 😊

(With utmost respect for all of them of course)

2020-09-15 13-07-23
Kilian Hekhuis
 
MVP

For those wanting to understand the underlying mechanisms, see this excellent post.
