Skip to content

cmp argument parser rewrite #177

@GunterSchmidt

Description

@GunterSchmidt

Hello everyone,

I was checking the functionality of the cmp.rs and found some issues around not working options/arguments.

I did a lot of testing regarding the parser and found the following rules.

    /// Parses the command line arguments. \
    /// Since cmp is called from diffutils, the first argument must always be "cmp".
    ///
    /// The following checks require more extensive checks than a simple compare.
    /// These are all identical and make parsing extensive:
    /// - cmp file_1 file_2 -b -l -n 50
    /// - cmp file_1 file_2 -b -l -n50
    /// - cmp file_1 file_2 -bl -n50
    /// - cmp file_1 file_2 -bln 50
    /// - cmp file_1 file_2 -bln50
    /// - cmp file_1 file_2 --print-bytes --verbose --bytes 50
    /// - cmp file_1 file_2 --print-bytes --verbose --bytes=50
    /// - cmp file_1 file_2 --p --verb --by 50
    /// - cmp file_1 file_2 --p --verb --by=50
    /// 
    /// and many more rules.

A number of them are not working in the current cmp.rs version.

After I started to fix some of the issues, I started refactoring the whole parser.

  • Separation of concerns:
    • I moved all code regarding parsing into to Params struct.
    • The parser now only returns enums (Params or InfoText for Version and Help) for the OK case or a specified Error.
    • All errors have been moved into an Error Enum ParamsParseError which handles all output in Display.
    • This allows a centralized maintenance of errors and error messages. The caller can work with the error enum more easily.
  • Renaming of the Param properties.
    • from and to replaced with file_1 and file_2, as these are the parameters.
    • max_bytes with bytes_limit
    • quiet with silent as s is the short code for it
    • skip_a with ignore_initial_bytes_file_1, so there is a connection between options and the property name
  • The bytes variables where size usize, thus limiting the readable bytes on 32-bit systems. GNU cmp is compliles with
    Large File Support and allows i64 values. I changed it to u64 with a Feature to change it to u128.
  • bytes-limit: The tests set the bytes-limit to MAX:usize if a very large number is encountered. This is incorrect,
    GNU cmp gives an error message.
  • Corrected parser, e.g. -bln50KiB is now allowed or --ig for --ignore-initial
  • Correct output of error messages, Help text is missing. New error messages like in GNU cmp.
  • Added text for --version and --help
  • Tests
    • Added new tests for the above valid options.
    • Changed tests where bytes where set to MAX.
    • Tests not can be written with simple String "cmp foo bar -n 50"

Questions:

  • Params executable: What is the reason to store this value? It is the argument 'cmp' which is used to start the program.
    I don't see how this can ever change and it requires a lot of extra code to pass it to the error messages.
    const APP_NAME: &str = 'cmp'; would be sufficient, or am I missing something here? I would like to remove it.
  • GNU cmp behaves very odd when passing '--'. Try 'cmp -- file_1 --help'. Why whould --help be a file?
    I implemented an error message instead, as the usage is probably not intended.
    Is this an issue, as it would not behave 100% like GNU cmp. Try 'cmp -- file_1 --help'. Why whould --help be a file?
  • The error messages are slightly changed, e.g.
    • 'option requires an argument -- 'n'' when only passing -n, but
    • 'option '--bytes' requires and argument', which is nicer to read.
      I do not differentiate here and print:
    • 'option '--bytes' ('-n') requires an argument'
  • Params is generally public and could be used elsewhere, e.g. the number conversion. Should I keep it public?

Before submitting a PR I would like to have some feedback on this and the refactoring.

Check out the new version at my fork.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions