Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some cargo.toml gets scrambled when runing cargo upgrade #218

Closed
Eijebong opened this issue Jul 19, 2018 · 5 comments
Closed

Some cargo.toml gets scrambled when runing cargo upgrade #218

Eijebong opened this issue Jul 19, 2018 · 5 comments

Comments

@Eijebong
Copy link
Contributor

I ran cargo upgrade parking_lot --all

@ordian
Copy link
Collaborator

ordian commented Jul 19, 2018

hey @Eijebong, yes it's a known limitation of the underlying toml library we currently use in cargo-edit. It would be fixed when/if we switch to tom.

@Eijebong
Copy link
Contributor Author

Sure, I spoke to @killercup who told me to open this to check that the switch would fix these cases

@alsuren
Copy link
Contributor

alsuren commented Jan 19, 2019

I was just talking to the maintainer of combine (which is used by cargo_edit) on gitter, and I reckon that if we store a list of start+end positions for each table in the tree then we could serialise unmodified tables/table-fragments back in their original form+position. This could potentially be done without replacing cargo-edit.

You'd still have this class of bug if you try to edit a fragmented table, but it should be possible to emit a warning (or flat-out fail and require a --regroup flag) in that case, because needing to programmatically edit such tables would be very rare.

Does this sound like a sensible idea?

@alsuren
Copy link
Contributor

alsuren commented Jun 3, 2019

While I was busy procrastinating actually starting on the above plan, I cloned rust-lang/rust-repos and went hunting for Cargo.toml examples on github. I found 80164 Cargo.toml files in the roots of git repos. I cleaned up DOS line endings and missing final newlines manually.

I then used cargo add to add a dependency.

Of the 80164 samples I found:

  • 2310 were virtual manifests
  • 54 were invalid syntax
  • 69455 (87%) just added a single line (as expected)
  • 6889 (9%) also added a [dependencies] section (still fine)
  • 1457 (2%) also did something else (mostly moving sections of the file around, but ~10 also made whitespace changes, like removing spaces from section headers)

I've dumped a list of links to the 1457 files at https://raw.githubusercontent.com/alsuren/rust-repos/cargo.toml-samples/largest-diffs-last.urls . The files with the smallest diff are at the top, and the largest diffs are at the bottom. The use-case that regularly produces large diffs is having a [package.metadata.deb] or [package.metadata.release] or [package.metadata.bootimage] section at the bottom of the file.

I have a day off on Wednesday, so I might have a go at tackling this then. Alternatively I could tackle #217 first, because it's probably easier, even though I only found 191 files where there are more instances of = *' than = *".

bors bot added a commit to toml-rs/toml that referenced this issue Jun 10, 2019
60: Replace Cell and RefCell usage in display.rs r=ordian a=alsuren

I also split the implementation into a helper method that can walk a tree of tables, and a callback that does the formatting.

Currently, I use a closure to pass the Formatter into the callback, which feels a bit naughty. A cleaner implementation might be to implement the full visitor pattern with a struct, but I get in trouble with lifetimes whenever I store a reference to the Formatter anywhere.

The eventual plan is to create `Table::visit_nested_tables_in_original_order()` and `Document::to_string_in_original_order()` (or similar), to solve killercup/cargo-edit#218 . That PR is still a bit of a mess though, so I've extracted this bit to get the ball rolling.

Tell me what you think.

Co-authored-by: David Laban <alsuren@gmail.com>
bors bot added a commit to toml-rs/toml that referenced this issue Jul 1, 2019
63: Avoid reordering tables r=ordian a=alsuren

Adds Document::to_string_in_original_order()

This is intended to help with killercup/cargo-edit#218
    
We save .position = Some(i) on each Table, starting with i = 0 for
the root of the Document, and then incrementing each time we see a
table header.

When serialising, ~we start looking for a Table with position 0,
which we should find at the root of the document. We then continue
scanning, looking for position 1 and so on. If the document doesn't
need re-ordering then we will on need to do a single scan.~ we start by 
doing a walk over the tree and working out which positions exist
(storing them in a Vec and then sorting them). We then do subsequent
walks to find each of the known positions in turn. Each walk of the tree
is guaranteed to yield at least one table, but will probably yield more.
If the document doesn't need re-ordering then we will on need to do a single scan.

If a Table has been added using the API rather than parsed then it
will have position = None. We only want to serialise each Table
once, so we only print tables with position = None if the table
we visited immediately before it had the right position.

If we didn't manage to serialise all Tables after a single loop
then we keep looping until we are done. ~There is code to make sure
we don't get stuck when someone causes a gap in the numbers by
deleting a Table.~ It is impossible for the loop to get stuck, because
it only searches for position values that it already knows are in the tree.

Known issues:
- [x] ~If you have created your Document by parsing two .toml files and
  merging the results together, this method may silently skip some
  tables. Please use Document.to_string() instead, which doesn't have
  this problem.~ fixed
- [ ] The best case performance of this function is a bit worse than
  Document.to_string(). It will be significantly worse if you have lots of
  tables that are in strange orders.


Also adds `pretty_assertions` in some tests and color-backtrace (as separate commits) because I found debugging test failures more tricky without them.

Things to do before merging:
- [x] make a version of cargo-edit which uses this patch, and make sure that it actually works on my test samples at https://github.com/alsuren/rust-repos/tree/cargo.toml-samples
  - see killercup/cargo-edit#312
  - went from 1457 funky files down to 13 (and I think that some of them are because I didn't clean up the unix/windows line endings correctly)
- [x] Write a few more integration tests, describing interesting edge cases from the repo. For example:
  - [x] An example with trailing comments after the end of the document.
  - [x] An example where `[package.metadata.deb]` is at the bottom of the file and it stays there.
    * The table_reordering test already does this.
  - [x] A test with the following document, where you add `[dependencies.somethingelse]` and it is added after `[dependencies.opencl]` rather than before `[[example]]`.
```
[package]
[dependencies]
[[example]]
[dependencies.opencl]
[dev-dependencies]
```
- [x] Is the API okay? Display.to_string() seems to swallow the error case (possibly by panicking). Should we do the same? (answer: yes)
- [x] (edit: added a commit to use this new algorithm) Is it enough to document the issue where it can silently skip tables that are parsed from multiple files, or should I check for it? If I want to fix it properly then the best way to do it is to:
  * walk the tree once, gathering all of the known `position` numbers unto a vec
  * Sort the positions vec
  * Walk the tree and pop off items from the positions vec as I go (rather than incrementing an integer)
  * Stop looping when the vec is empty.
  * This is probably simpler than my current implementation, but forces me to allocate another vec, and then do at least two passes of the tree, even if the tables are in the correct order already.
- [x] update the limitations section of the readme to explain that you should use Document::to_string_in_original_order() if you care about not forcing nested tables to be grouped next to their parents.
  - [ ] Once this PR is merged, make another PR to point `https://github.com/ordian/toml_edit/blob/f09bd5d075fdb7d2ef8d9bb3270a34506c276753/tests/test_valid.rs#L84` at the updated test.

Co-authored-by: David Laban <alsuren@gmail.com>
bors bot added a commit that referenced this issue Jul 20, 2019
312: Use toml_edit's to_string_in_original_order() r=ordian a=alsuren

uses toml_edit from this PR: toml-rs/toml#63

solves #218 pretty convincingly 

previously it was:
> 1457 (2%) also did something else (mostly moving sections of the file around, but ~10 also made whitespace changes, like removing spaces from section headers)

Now it is:
> 13 (0.02%) also did something else (mostly removing whitespace from section headers but a handful also fixed unix/windows line endings that I failed to clean up properly before starting).

Co-authored-by: David Laban <alsuren@gmail.com>
@ordian
Copy link
Collaborator

ordian commented Oct 27, 2019

I believe this is resolved in #312. Feel free to reopen if it's not the case.

@ordian ordian closed this as completed Oct 27, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants