RTL multi-line text gets bottom-up #901
Comments
I believe we should add the text direction as a property of each fragment and perform the line break algorithm using the correct direction. While at it maybe we can improve our line break algorithm to make it compliant with the Unicode Line Breaking Algorithm. |
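As a rough illustration of the "direction as a property of each fragment" idea, here is a minimal sketch using only the standard library. `DirectedFragment` and `first_strong_direction` are hypothetical names made up for this example, not fpdf2 classes or functions, and the first-strong-character heuristic is a simplifying assumption, much cruder than the Unicode Bidirectional Algorithm.

```python
import unicodedata
from dataclasses import dataclass


def first_strong_direction(text):
    # Return "rtl" or "ltr" based on the first strongly directional
    # character; fall back to "ltr" when the text has none (e.g. digits).
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # Hebrew-like and Arabic letters
            return "rtl"
        if bidi_class == "L":
            return "ltr"
    return "ltr"


@dataclass
class DirectedFragment:
    # Hypothetical fragment type carrying its own text direction.
    text: str
    direction: str

    @classmethod
    def from_text(cls, text):
        return cls(text, first_strong_direction(text))
```

A line breaker could then consult `fragment.direction` instead of assuming that every fragment runs left to right.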
Fixed in #1096 |
done |
Hey @andersonhc I've upgraded to 2.7.8 but am still seeing this issue when using reshaping. I am using Google's NotoSans AR font. Repro:
|
Hi @andrewlutz, can you please remove the calls to arabic_reshaper and bidi.algorithm and try again? Just pass text instead of shaped_text in multicell |
I think the issue I'm having is that the string coming from our translation system is in LTR order, I need to run Does FPDF assume that the text is already in LTR order? |
just pass the unshaped text - if you are using set_text_shaping() fpdf will handle all the LTR-RTL conversion |
If you drop the string produced using just |
If you copy text from the PDF file and paste it elsewhere, it won't work. We still need to implement marked-content tags to properly signal the original text order to the client. |
Ah ok thanks for clarifying that. I'll check the final result with our internationalization team. |
@andersonhc I'm at a bit of a loss here, I checked with an Arabic speaking colleague and the feedback I received was that the text produced when using The text below was produced with text shaping disabled and with |
I found and fixed the bug with When you use get_display(reshape()), you transform the whole text into LTR, so fpdf is not aware of the correct direction of each part of the text and the line break is done as for normal LTR text. |
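To illustrate why pre-processing the whole string breaks wrapping, here is a minimal sketch in plain Python. It does not call fpdf2, arabic_reshaper or python-bidi: `to_visual_order` is a hypothetical stand-in that simply reverses the words of an all-RTL text (real bidi reordering is far more involved), and `wrap` is a naive greedy word wrapper, not fpdf2's actual line-break code.

```python
def to_visual_order(words):
    # Hypothetical stand-in for get_display(): for a purely RTL text,
    # visual order is roughly the logical order reversed.
    return list(reversed(words))


def wrap(words, max_words):
    # Naive greedy wrapper: fill each line with up to max_words words,
    # consuming the input from its start, as an LTR wrapper would.
    return [words[i:i + max_words] for i in range(0, len(words), max_words)]


# Logical (reading) order of a five-word RTL sentence; w1 is read first.
logical = ["w1", "w2", "w3", "w4", "w5"]

# Wrapping the logical text first and reordering each line afterwards
# keeps the first words of the sentence on the first line.
good = [to_visual_order(line) for line in wrap(logical, 2)]

# Reordering the whole text first and wrapping afterwards puts the last
# words of the sentence on the first line: the bottom-up effect.
bad = wrap(to_visual_order(logical), 2)
```

In `good` the first line holds `w1` and `w2`, while in `bad` the first line holds `w5` and `w4`, which matches the symptom reported in this issue.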
Thanks! Will test this out later today |
Hey @andersonhc I tried this but am still getting the same bottom-up result. Were you able to verify with the repro I provided? I'm wondering if somehow the latest fpdf didn't get pulled in correctly. |
I installed the latest fpdf version from github master:
and ran the following Python file:

```python
from fpdf import FPDF

# Assumption: PATH is the directory holding the font files (not shown in the original comment).
PATH = './'

pdf = FPDF(orientation='P', unit='mm', format='A4')
pdf.add_font(family='STANDARD', style='', fname=PATH + 'NotoSans-Regular.ttf')
pdf.add_font(family='NOTO_SANS_AR', style='', fname=PATH + 'NotoSansArabic-Regular.ttf')
pdf.set_font('STANDARD')
# Arabic glyphs missing from the main font are taken from the fallback font.
pdf.set_fallback_fonts(['NOTO_SANS_AR'])
pdf.add_page()
# Pass the unshaped, logical-order text and let fpdf handle shaping and direction.
pdf.set_text_shaping(use_shaping_engine=True, direction="rtl")
text = 'في Company X نعمل بجد لاتخاذ إجراءات ضد المحتوى الضار وغير القانوني. ونتيجة لذلك، قمنا بتقييد إمكانية الوصول إلى المنشور الذي قمتَ بـ تم إنشاؤه.'
pdf.set_font_size(8)
pdf.multi_cell(w=pdf.epw, text=text, align='R')
pdf.ln()
pdf.set_font_size(20)
pdf.multi_cell(w=pdf.epw, text=text, align='R')
pdf.output("issue-901.pdf")
```

The result I got is attached. Visually, it is similar to what you are getting in your last message, and the one breaking across multiple lines is also visually similar. |
Glad to hear it's working for you, that result looks good. Likely an issue with importing on my end. When do you estimate a release with this fix will be ready? |
Hey @andersonhc are you able to do a release soon with the RTL fix? We'd really appreciate it, as it's currently a blocker for us :) |
When Arabic text is placed in a multi-line context (multi_cell(), write(), text regions, etc.), in addition to getting correctly printed right-to-left on each line, the actual lines end up in reverse order, i.e. bottom-to-top. I've modified the relevant example from "test_text_shaping.py" to demonstrate:
Maybe we were a bit too eager to declare this topic solved... 🙄
Now in hindsight, this is not really surprising, given how our line wrapping is implemented.
For any rtl text, we really should start scanning from the back.
Unfortunately we can't just look at the current fragment in isolation, as there might be several rtl fragments in sequence, which we need to wrap as a whole.
Food for thought...
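One possible starting point for the "several rtl fragments in sequence" problem is to merge consecutive fragments that share a direction into runs before wrapping. The sketch below is only an assumption about how that could look: `group_runs` is a hypothetical helper, and fragments are modelled as plain `(text, direction)` tuples, not fpdf2's own fragment objects.

```python
from itertools import groupby


def group_runs(fragments):
    # fragments: list of (text, direction) tuples in logical order.
    # Consecutive fragments sharing a direction are merged into one run,
    # so that each run can be wrapped as a whole.
    return [
        (direction, [text for text, _ in group])
        for direction, group in groupby(fragments, key=lambda f: f[1])
    ]


runs = group_runs([
    ("Hello", "ltr"),
    ("a", "rtl"),
    ("b", "rtl"),
    ("World", "ltr"),
])
```

Here the two adjacent rtl fragments end up in a single run, which the wrapping code could then treat as one unit.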