Emoji brings down hello-world example #569

Open
TamaMcGlinn opened this issue Jun 12, 2021 · 2 comments

@TamaMcGlinn

If you paste this program into the interactive hello-world compiler box, the run apparently crashes. Editing the program afterwards still works, so I'm not sure this is really a problem.

with Ada.Text_IO; use Ada.Text_IO;

procedure Greet is
   Message : String (1 .. 4) := "😊";
   --        ^ Pre-defined array type.
   --          Component type is Character
begin
   for I in reverse Message'Range loop
      --    ^ Iterate in reverse order
      Put (Message (I));
   end loop;
   New_Line;
end Greet;

The error message is:

$ ./greet
The machine running the examples may not be available or is busy, please try again now or come back later.

[Screenshot from 2021-06-12 11-22-17]

@gusthoff
Collaborator

This is an interesting corner case. Your example makes celery crash in the backend:

celery.backends.base.UnicodeDecodeError: ('Traceback (most recent call last):', '  File "/vagrant/app/widget/tasks.py", line 38, in run_program', '    code, out = project.run()', '  File "/vagrant/app/widget/project.py", line 307, in run', '    code, out, err = self.container.execute(line, reporter=rep)', '  File "/vagrant/app/widget/container.py", line 149, in execute', '    user)', '  File "/vagrant/app/widget/container.py", line 118, in _stream_exec', '    temp = stdout.decode("utf-8")', "UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8a in position 0: invalid start byte", '')

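The first issue with your code example is the reverse in the loop: it emits the bytes of the UTF-8 sequence in reverse order, so the output is no longer valid UTF-8. The error message above shows this: the first byte received, 0x8a, is the last byte of the emoji's four-byte encoding (f0 9f 98 8a) and cannot start a UTF-8 sequence.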
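For illustration, the decode failure can be reproduced outside the widget. This is a minimal Python sketch (Python because the backend traceback is Python), not the project's code; `decode_error_reason` is a hypothetical helper introduced only for this example.

```python
from typing import Optional


def decode_error_reason(data: bytes) -> Optional[str]:
    """Return the UTF-8 decode error reason for data, or None if it decodes cleanly."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.reason
    return None


# The emoji from the report, as the four UTF-8 bytes held in String (1 .. 4).
emoji_bytes = "😊".encode("utf-8")
print(emoji_bytes.hex(" "))  # f0 9f 98 8a

# Iterating in reverse emits the same bytes back to front.
reversed_bytes = emoji_bytes[::-1]
print(reversed_bytes.hex(" "))  # 8a 98 9f f0

print(decode_error_reason(emoji_bytes))     # None
print(decode_error_reason(reversed_bytes))  # invalid start byte
```

The reversed sequence starts with 0x8a, a continuation byte, which matches the "invalid start byte" reason in the traceback.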

However, even without the reverse keyword, calling Put for each character still makes celery crash:

celery.backends.base.UnicodeDecodeError: ('Traceback (most recent call last):', '  File "/vagrant/app/widget/tasks.py", line 38, in run_program', '    code, out = project.run()', '  File "/vagrant/app/widget/project.py", line 307, in run', '    code, out, err = self.container.execute(line, reporter=rep)', '  File "/vagrant/app/widget/container.py", line 149, in execute', '    user)', '  File "/vagrant/app/widget/container.py", line 118, in _stream_exec', '    temp = stdout.decode("utf-8")', "UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf0 in position 0: unexpected end of data", '')

In this case, the backend most probably receives the output of each Put call as a separate chunk and decodes it on its own. A chunk that holds only the first byte (0xf0) of the four-byte sequence is incomplete UTF-8, which is probably what causes the "unexpected end of data" error above.
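One way a backend could tolerate output that arrives one byte at a time is an incremental decoder, which buffers an incomplete sequence until the rest arrives. The sketch below is only an assumption about how this might be handled, not the code in container.py; `decode_chunks` is a hypothetical helper, and `errors="replace"` is a design choice that substitutes U+FFFD for invalid bytes instead of raising.

```python
import codecs
from typing import Iterable, Iterator


def decode_chunks(chunks: Iterable[bytes]) -> Iterator[str]:
    """Decode byte chunks as UTF-8 even when a chunk ends mid-character."""
    # The incremental decoder keeps trailing incomplete bytes between calls.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush: anything still buffered at the end is invalid and gets replaced.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# One chunk per Put call, as in the version without reverse.
forward = [b"\xf0", b"\x9f", b"\x98", b"\x8a"]

# Decoding each chunk independently fails on the very first one.
try:
    forward[0].decode("utf-8")
except UnicodeDecodeError as err:
    print(err.reason)  # unexpected end of data

# Decoding incrementally reassembles the character.
print("".join(decode_chunks(forward)))  # 😊
```

With the reversed byte order the same helper does not raise either; it yields replacement characters, so the invalid output is shown to the user rather than crashing the task.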

@gusthoff
Collaborator

gusthoff commented Aug 27, 2021

Note that the correct version of the code example works fine:

with Ada.Text_IO;              use Ada.Text_IO;
with Ada.Strings.UTF_Encoding; use Ada.Strings.UTF_Encoding;

procedure Greet is
   Message : constant UTF_8_String := "😊";
begin
   Put_Line (Message);
end Greet;

Also, if you assume that all strings are in UTF-8 format and don't need to be processed, you could just write this instead:

with Ada.Text_IO; use Ada.Text_IO;

procedure Greet is
   Message : constant String := "😊";
begin
   Put_Line (Message);
end Greet;
