Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Excel workbooks needing to be recovered #354

Open
dwedigital opened this issue Feb 26, 2024 · 4 comments
Open

Excel workbooks needing to be recovered #354

dwedigital opened this issue Feb 26, 2024 · 4 comments

Comments

@dwedigital
Copy link

Any excel workbook I create with a basic worksheet is needing to be recovered. I have unzipped the xlsx workbook before repair and after repair (see below for the differences between the XML after repair).

Is there any issue with out of date schema being used or something obvious I am missing? Once I allow the file to be repaired and save it,, all is OK but would be good to not have this issue in the first place.

Before repair

<?xml version="1.0" encoding="UTF-8"?><worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xml:space="preserve"><sheetPr ><pageSetUpPr fitToPage="0" /></sheetPr><dimension ref="A1:D8"></dimension><sheetViews><sheetView defaultGridColor="1" rightToLeft="0" showFormulas="0" showGridLines="1" showRowColHeaders="1" showRuler="1" showWhiteSpace="0" showZeros="1" tabSelected="0" windowProtection="0" showOutlineSymbols="0" zoomScaleSheetLayoutView="0" zoomScalePageLayoutView="0" zoomScaleNormal="0" workbookViewId="0" zoomScale="100" ></sheetView></sheetViews><sheetFormatPr baseColWidth="8" defaultRowHeight="18" /><cols><col width="8.8" min="1" max="1" bestFit="1" customWidth="1" /><col width="30.800000000000004" min="2" max="2" bestFit="1" customWidth="1" /><col width="15.400000000000002" min="3" max="3" bestFit="1" customWidth="1" /><col width="30.800000000000004" min="4" max="4" bestFit="1" customWidth="1" /></cols><sheetData><row r="1" ><c r="A1" s="0" t="s"><v>0</v></c><c r="B1" s="0" t="s"><v>1</v></c><c r="C1" s="0" t="s"><v>2</v></c><c r="D1" s="0" t="s"><v>3</v></c></row><row r="2" ><c r="A2" s="0" t="n"><v>9798</v></c><c r="B2" s="0" t="d"><v>2021-12-06T14:28:07+00:00</v></c><c r="C2" s="0" t="s"><v>4</v></c><c r="D2" s="0" t="d"><v>2021-12-06T14:28:07+00:00</v></c></row><row r="3" ><c r="A3" s="0" t="n"><v>9799</v></c><c r="B3" s="0" t="d"><v>2021-12-06T14:28:07+00:00</v></c><c r="C3" s="0" t="s"><v>5</v></c><c r="D3" s="0" t="d"><v>2021-12-06T14:28:07+00:00</v></c></row><row r="4" ><c r="A4" s="0" t="n"><v>9800</v></c><c r="B4" s="0" t="d"><v>2021-12-06T14:28:07+00:00</v></c><c r="C4" s="0" t="s"><v>6</v></c><c r="D4" s="0" t="d"><v>2021-12-06T14:28:07+00:00</v></c></row><row r="5" ><c r="A5" s="0" t="n"><v>9801</v></c><c r="B5" s="0" t="d"><v>2021-12-06T14:28:07+00:00</v></c><c r="C5" s="0" t="s"><v>7</v></c><c r="D5" s="0" t="d"><v>2021-12-06T14:28:07+00:00</v></c></row><row r="6" ><c r="A6" s="0" t="n"><v>9802</v></c><c r="B6" s="0" t="d"><v>2021-12-06T14:28:07+00:00</v></c><c r="C6" s="0" t="s"><v>8</v></c><c r="D6" s="0" t="d"><v>2021-12-06T14:28:07+00:00</v></c></row><row r="7" ><c r="A7" s="0" t="n"><v>9803</v></c><c r="B7" s="0" t="d"><v>2021-12-06T14:28:07+00:00</v></c><c r="C7" s="0" t="s"><v>9</v></c><c r="D7" s="0" t="d"><v>2021-12-06T14:28:07+00:00</v></c></row><row r="8" ><c r="A8" s="0" t="n"><v>10518</v></c><c r="B8" s="0" t="d"><v>2022-01-18T13:17:20+00:00</v></c><c r="C8" s="0" t="s"><v>10</v></c><c r="D8" s="0" t="d"><v>2022-01-18T13:17:20+00:00</v></c></row></sheetData><sheetCalcPr fullCalcOnLoad="1" /><printOptions gridLines="0" headings="0" horizontalCentered="0" verticalCentered="0" /><pageMargins left="0.75" right="0.75" top="1.0" bottom="1.0" header="0.5" footer="0.5" /><pageSetup /><headerFooter ></headerFooter></worksheet>

After repair

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" mc:Ignorable="x14ac xr xr2 xr3" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac" xmlns:xr="http://schemas.microsoft.com/office/spreadsheetml/2014/revision" xmlns:xr2="http://schemas.microsoft.com/office/spreadsheetml/2015/revision2" xmlns:xr3="http://schemas.microsoft.com/office/spreadsheetml/2016/revision3" xr:uid="{00000000-0001-0000-0000-000000000000}"><dimension ref="A1:D8"/><sheetViews><sheetView tabSelected="1" showOutlineSymbols="0" showWhiteSpace="0" workbookViewId="0"/></sheetViews><sheetFormatPr baseColWidth="10" defaultColWidth="8.83203125" defaultRowHeight="14" x14ac:dyDescent="0.15"/><cols><col min="1" max="1" width="8.83203125" bestFit="1" customWidth="1"/><col min="2" max="2" width="30.83203125" bestFit="1" customWidth="1"/><col min="3" max="3" width="15.33203125" bestFit="1" customWidth="1"/><col min="4" max="4" width="30.83203125" bestFit="1" customWidth="1"/></cols><sheetData><row r="1" spans="1:4" x14ac:dyDescent="0.15"><c r="A1" t="s"><v>0</v></c><c r="B1" t="s"><v>1</v></c><c r="C1" t="s"><v>2</v></c><c r="D1" t="s"><v>3</v></c></row><row r="2" spans="1:4" x14ac:dyDescent="0.15"><c r="A2"><v>9798</v></c><c r="B2" t="s"><v>11</v></c><c r="C2" t="s"><v>4</v></c><c r="D2" t="s"><v>11</v></c></row><row r="3" spans="1:4" x14ac:dyDescent="0.15"><c r="A3"><v>9799</v></c><c r="B3" t="s"><v>11</v></c><c r="C3" t="s"><v>5</v></c><c r="D3" t="s"><v>11</v></c></row><row r="4" spans="1:4" x14ac:dyDescent="0.15"><c r="A4"><v>9800</v></c><c r="B4" t="s"><v>11</v></c><c r="C4" t="s"><v>6</v></c><c r="D4" t="s"><v>11</v></c></row><row r="5" spans="1:4" x14ac:dyDescent="0.15"><c r="A5"><v>9801</v></c><c r="B5" t="s"><v>11</v></c><c r="C5" t="s"><v>7</v></c><c r="D5" t="s"><v>11</v></c></row><row r="6" spans="1:4" x14ac:dyDescent="0.15"><c r="A6"><v>9802</v></c><c r="B6" t="s"><v>11</v></c><c r="C6" t="s"><v>8</v></c><c r="D6" t="s"><v>11</v></c></row><row r="7" spans="1:4" x14ac:dyDescent="0.15"><c r="A7"><v>9803</v></c><c r="B7" t="s"><v>11</v></c><c r="C7" t="s"><v>9</v></c><c r="D7" t="s"><v>11</v></c></row><row r="8" spans="1:4" x14ac:dyDescent="0.15"><c r="A8"><v>10518</v></c><c r="B8" t="s"><v>12</v></c><c r="C8" t="s"><v>10</v></c><c r="D8" t="s"><v>12</v></c></row></sheetData><pageMargins left="0.75" right="0.75" top="1" bottom="1" header="0.5" footer="0.5"/><pageSetup orientation="portrait"/></worksheet>
@straydogstudio
Copy link
Contributor

@dwedigital Can you confirm which caxlsx you are using, and which ruby?

@dwedigital
Copy link
Author

caxlsx (4.0.0)
ruby 3.1.3p185 (2022-11-24 revision 1a6b16756e) [arm64-darwin21]

@tagliala
Copy link
Contributor

Is it possible to have a reproducible test case?

require 'axlsx'

p = Axlsx::Package.new
wb = p.workbook

puts "Ruby version: #{RUBY_VERSION}"
puts "Caxlsx version: #{Axlsx::VERSION}"

wb.add_worksheet(name: 'GH Issue 354 Test Case') do |sheet|
  # TODO: minimum amount of code to reproduce
end

p.serialize 'gh_issue_354.xlsx'

@kiskoza
Copy link
Member

kiskoza commented Apr 11, 2024

tl;dr: It looks like Excel has a different opinion on how an ISO 8601 format works.


I've checked the diff between the two xmls and found that Excel reduced the cell definitions from

<c r="B6" s="0" t="d">
  <v>2021-12-06T14:28:07+00:00</v>
</c>

to be only

<c r="B6" t="s">
  <v>11</v>
</c>

I'm not sure why it was replaced with the value 11, I was able to reproduce the error messages, but not this detail, all my trials kept the date as a string in the end.

My latest code looks like this:

wb.add_worksheet(name: 'GH Issue 354 Test Case') do |sheet|
  values = [
    DateTime.new(2021, 12, 06, 14, 28, 7).iso8601, # breaks Excel
    '2021-12-06T14:28:07', # works as expected
    '2021-12-06T14:28:07+01:00', # breaks Excel
  ]

  values.each do |value|
    sheet.add_row [value, value.match?(Axlsx::ISO_8601_REGEX)] # add `types: [:iso_8601, :boolean]` to be more explicit
  end
end

The example has three values I try to insert into a cell as an iso8601 formatted value. All three got interpreted as iso_8601 cell type in the gem (the regexp matches and caxlsx inserts t="d" attribute to the cell), but only the second works in Excel.

I would change the ISO_8601_REGEX to match the definition of iso formats used in Excel, we might need to find the exact definition. What do you think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants