Support adding line endings when a string containing \n is added to the document #1260

Open
1 of 5 tasks
Zt-freak opened this issue Feb 1, 2024 · 11 comments · May be fixed by #1346

Zt-freak commented Feb 1, 2024

NPOI Version Used

2.6.2

File Type

  • XLSX
  • XLS
  • DOCX
  • XLSM
  • OTHER

Use Case

I want to be able to add line endings to a document; however, every line ending \n in the string I add gets converted into a space.

I tested with both XWPFParagraph.ReplaceText() and XWPFRun.SetText(); both show this \n-to-space conversion behaviour.

Description

I have an example program that calls both XWPFParagraph.ReplaceText() and XWPFRun.SetText():

using System.IO;
using System.Linq;
using NPOI.XWPF.UserModel;

using FileStream rs = File.OpenRead(@"test.docx");

using var doc = new XWPFDocument(rs);

XWPFParagraph firstPara = doc.Paragraphs.First();

firstPara.ReplaceText("{test}", "A\nB\nC\nD");

XWPFRun run = firstPara.CreateRun();
run.TextPosition = 8;
run.SetText("E\nF\nG\nH");

using var ws = File.Create(@"output.docx");
doc.Write(ws);

The document contains one paragraph containing the text:

{test}

The resulting document will contain the following text:

A B C D E F G H

While the desired result would be:

A
B
C
D
E
F
G
H

Bykiev (Collaborator) commented Feb 21, 2024

Hi, you can use the AddCarriageReturn() method this way:

string text = "E\nF\nG\nH";
var lines = text.Split("\n");

run.SetText(lines[0]);

for(int i = 1; i < lines.Length; i++)
{
    run.AddCarriageReturn();
    run.AppendText(lines[i]);
}

Zt-freak (Author) commented

@Bykiev this solution using AddCarriageReturn appears to work, except within tables.

Bykiev (Collaborator) commented Mar 29, 2024

@Zt-freak, you can use this code for tables:

  var table = doc.Tables.First();
  var row = table.GetRow(0);
  var cell = row.GetCell(0);
  foreach (var p in cell.Paragraphs)
  {
      foreach (var r in p.Runs)
      {
          if (!string.IsNullOrWhiteSpace(r.Text))
          {
              var lines2 = r.Text.Split(new string[] { "\\n" }, StringSplitOptions.None);

              r.SetText(lines2[0]);

              for (int i = 1; i < lines2.Length; i++)
              {
                  r.AddCarriageReturn();
                  r.AppendText(lines2[i]);
              }
          }
      }
  }

Zt-freak (Author) commented Apr 2, 2024

@Bykiev Sadly, the table code above doesn't work on my end: it only removes the "\n"s from the paragraph.

Bykiev (Collaborator) commented Apr 2, 2024

Please attach your code and file.

Zt-freak (Author) commented Apr 2, 2024

Bykiev (Collaborator) commented Apr 2, 2024

@Zt-freak, the code looks good, but the paragraph in your document contains multiple runs. I opened the document in WPS Office, selected the second cell and cleared the formatting; after that there is only 1 run. In MS Word, however, clearing the formatting leaves 10 runs. As this is not related to NPOI, the best approach is to concatenate the paragraph's runs into one string and then split it on the newline symbol.
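The suggestion above can be sketched as follows. This is a minimal sketch, not NPOI functionality: `RunMerger` and `MergeRunsAndSplit` are hypothetical names, and the approach assumes it is acceptable to drop the original runs, which discards any per-run formatting. Merging first matters because a separator such as the literal two characters `\n` may be split across two runs, in which case a per-run `Split` never sees it.

```csharp
using System;
using System.Linq;
using NPOI.XWPF.UserModel;

// Hypothetical helper, not part of NPOI.
public static class RunMerger
{
    // Joins the text of all runs in the paragraph, then rebuilds the paragraph
    // as a single run with a carriage return between the separated parts.
    // Assumption: losing the formatting of the original runs is acceptable.
    public static void MergeRunsAndSplit(XWPFParagraph paragraph, string separator)
    {
        if (paragraph.Runs.Count == 0)
        {
            return;
        }

        // Concatenate first, so a separator spread over two runs is found.
        string fullText = string.Concat(paragraph.Runs.Select(r => r.Text ?? string.Empty));

        // Remove the runs back to front so the indices stay valid.
        for (int i = paragraph.Runs.Count - 1; i >= 0; i--)
        {
            paragraph.RemoveRun(i);
        }

        string[] lines = fullText.Split(new[] { separator }, StringSplitOptions.None);

        XWPFRun run = paragraph.CreateRun();
        run.SetText(lines[0]);
        for (int i = 1; i < lines.Length; i++)
        {
            run.AddCarriageReturn();
            run.AppendText(lines[i]);
        }
    }
}
```

For a template cell that contains the literal characters `\n`, the call would be `RunMerger.MergeRunsAndSplit(p, "\\n")` for each paragraph `p` in `cell.Paragraphs`.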

tonyqus (Member) commented Apr 2, 2024

@Bykiev Please ignore the behavior of WPS Office. It's usually different from Microsoft Word.

pbvs commented May 1, 2024

Hi @tonyqus, @Bykiev, @Zt-freak,

I was looking into this and was wondering if I could suggest an alternative solution to the problem. Our solution works as follows: we define a template in a Word document and then merge it with a data set to get the final document. So we define a set of placeholders in Word, which the software then replaces with the actual values from the data set. This means that our software solution is basically one big search-and-replace exercise.
For example, in my Word document I would like:
image

The software would then have to replace the text “$replace_text$” and “$replace_cell_text$” with a given text string; in my example I want to replace each tag with the value “Regel1\nRegel2\nRegel3”.

I created a small unit test:

image

If I run the unit test, NPOI generates the following output:
image

As suggested by @Bykiev, we would need to break each part up into a separate run, but that turns out to be rather difficult. If I create a Word document, add three lines and then look at the XML, I see that Word has created a run for each line.

image

I checked the Open XML documentation (Office Open XML (OOXML) - Word Processing - Text) and a carriage return can also be encoded in XML as “<w:cr/>”. So I did some digging.

When w:t is written to XML, this is done in “wml.cs” in the function “Write”. What if we replace the “\n” in this function with a <w:cr/> element? I made the following modification:
image

If I run my unit test with this modification, I do get the result I am looking for:

image

The XML for the top part now looks like this:

image

And the table cell looks like:
image

I would like to hear your thoughts on the matter and on the change I propose.
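Since the screenshots are not preserved here, the idea of emitting `<w:cr/>` wherever the text contains "\n" can be illustrated with a standalone sketch. This is not the actual modification to NPOI's wml.cs and not necessarily what the linked pull request does: `RunXmlSketch` and `WriteTextWithBreaks` are hypothetical names, and writing `xml:space="preserve"` on every `w:t` is an assumption made for simplicity.

```csharp
using System;
using System.Security;
using System.Text;

// Hypothetical illustration, not NPOI code.
public static class RunXmlSketch
{
    // Serializes run text as a sequence of <w:t> elements, inserting a
    // <w:cr/> element at every line break instead of writing "\n" into <w:t>.
    public static string WriteTextWithBreaks(string text)
    {
        var sb = new StringBuilder();
        // Normalize Windows line endings so "\r\n" yields a single break.
        string[] lines = text.Replace("\r\n", "\n").Split('\n');

        for (int i = 0; i < lines.Length; i++)
        {
            if (i > 0)
            {
                sb.Append("<w:cr/>");
            }
            if (lines[i].Length > 0)
            {
                sb.Append("<w:t xml:space=\"preserve\">");
                sb.Append(SecurityElement.Escape(lines[i])); // escape &, <, >, quotes
                sb.Append("</w:t>");
            }
        }
        return sb.ToString();
    }
}
```

With the value from the example, `WriteTextWithBreaks("Regel1\nRegel2\nRegel3")` yields three `w:t` elements separated by two `<w:cr/>` elements, all of which would sit inside a single `w:r`.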

Bykiev (Collaborator) commented May 4, 2024

@pbvs, looks good. Would you like to contribute?

tonyqus added the docx label May 5, 2024

pbvs commented May 6, 2024

@Bykiev Sure, no problem. I will make the change as described above, do some additional testing and create a pull request when finished.

pbvs linked a pull request May 14, 2024 that will close this issue