Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use ex_multihash instead of :crypto.hash(:sha512, input) #8

Open
5 tasks done
nelsonic opened this issue Dec 10, 2018 · 11 comments
Open
5 tasks done

Use ex_multihash instead of :crypto.hash(:sha512, input) #8

nelsonic opened this issue Dec 10, 2018 · 11 comments
Assignees
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@nelsonic
Copy link
Member

nelsonic commented Dec 10, 2018

At present, the Erlang :crypto.hash(:sha512, input) is called directly when creating a cid:

cid/lib/cid.ex

Line 25 in 3e8eb82

hash = :crypto.hash(:sha512, input)

In light of the fact that https://github.com/ipld/cid achieves a pretty similar goal,
I propose that we use multihash instead of directly creating a :sha512 hash.
This will ensure forward compatibility and also mean that we can use the JS cid function on the client.

Todo

@nelsonic nelsonic added enhancement New feature or request help wanted Extra attention is needed labels Dec 10, 2018
@nelsonic
Copy link
Member Author

If you need more context on the IPFS/IPLD CID function,
see: dwyl/learn-ipfs#8 and dwyl/learn-ipfs#5

@nelsonic nelsonic changed the title Use ex_multihash instead of invoking Use ex_multihash instead of invoking :crypto.hash(:sha512, input) Dec 10, 2018
@nelsonic nelsonic changed the title Use ex_multihash instead of invoking :crypto.hash(:sha512, input) Use ex_multihash instead of :crypto.hash(:sha512, input) Dec 10, 2018
@RobStallion
Copy link
Member

RobStallion commented Dec 11, 2018

@nelsonic I forked the repo and installed excoveralls to check the code coverage.

I know that they currently only have doctests but I figured if those doctests actually worked then all should be okay with them (unless I am misunderstanding why you might want separate tests)

image

COV    FILE                                        LINES RELEVANT   MISSED
 95.8% lib/multihash.ex                              261       24        1
  0.0% lib/util.ex                                    23        5        5
[TOTAL]  79.3%

There are no tests in the util file. There are only 2 function definitions however and one is a private function that is called by the other function. (What I am trying to say (UNLCEARLY) is that we would only need to test one function here)

There is only one line inmultihash.ex that is not being.

      do: encode(function, digest, length)

It is in the following function

  @spec encode(binary, binary, integer_default) :: on_encode
  def encode(<<_hash_code>> = hash_code, digest, length) when is_binary(digest) do
    with {:ok, function} <- get_hash_function(hash_code),
      do: encode(function, digest, length)
  end

I have called some of the functions in an iex session and have been able to replicate the results from the documentation. I haven't been able to test the function above though.

I can call it but I am not sure what argument to pass in to get past the following line

    with {:ok, function} <- get_hash_function(hash_code) do

as the get_hash_function function is keeps retuning an error for me.

With regards to this point

figure out how to use https://github.com/multiformats/ex_multihash to hash content.

I wouldn't say that I fully understand everything about hashing (right now) but the elixir code in this module is very well written and fairly straightforward to understand. (Plus the examples are pretty great)

@nelsonic
Copy link
Member Author

@RobStallion doctests are only testing the signature of the functions.
I ran the doctests locally too:
image
Warnings galore.

Did you push a commit to GitHub with your addition of excoveralls?

@RobStallion
Copy link
Member

RobStallion commented Dec 12, 2018

@nelsonic the doc tests do appear to be testing the functions

iex> Multihash.encode(:sha1, :crypto.hash(:sha, "Hello"))
{:ok, <<17, 20, 247, 255, 158, 139, 123, 178, 224, 155, 112, 147, 90, 93, 120, 94, 12, 197, 217, 208, 171, 240>>}

This calls the functions with the above arguments and expects the result below. These results can be replicated in your iex shell as well.

Also, if you change the returned binary in anyway the test fails.

The warnings all appear to be of the following...
@doc attribute is always discarded for private functions/macros/types

The only reason this appears to be a thing is because the devs decided to put documentation on their private helper functions.

@nelsonic
Copy link
Member Author

Yeah, Docs in private functions are a good thing. I don't know why Elixir considers them a warning. 🙄
I'm quite confident in the doctest for most of these functions, I think we should consider writing a test for Util.sum for completeness.
Creating a PR improving the test coverage of a project is more of a test of the "responsiveness" of the repo/package/module maintainer. i.e. do they accept PRs that don't change the code but simply improve it's overall reliability?

@RobStallion
Copy link
Member

@nelsonic Upon further inspection, the Multihas.Util/sum/2 function just ends up calling Multihash.encode/3 with a hash_type of :sha1, :sha2_256, :sha2_512.

This is probably the reason that they did not include the tests in the first place. (I should have noticed this on my first pass through 😞)

I can and have added tests for this now (in the same doc test format as the maintainers) and opened a pr here. Do you still want me to open a PR with them?

@RobStallion
Copy link
Member

RobStallion commented Dec 13, 2018

@nelsonic

figure out how to use https://github.com/multiformats/ex_multihash to hash content.

The code below is a modified version of the current Cid.make/2.

It can be tidied and the variables can be given better names (didn't think this was important for proof of concept) but it shows that we can use ex_multihash and still allow our tests to pass.

Original

  def make(input, length \\ 32) do
    hash = :crypto.hash(:sha512, input)

    hash
    |> Base.encode64()
    |> String.replace(~r/[Il0oO=\/\+]/, "", global: true)
    |> String.slice(0..(length - 1))
  end

With ex_multihash

  def make(input, length \\ 32) do
    hash1 = :crypto.hash(:sha512, input)
    {:ok, <<_multihash_code, _length, hash2::binary>>} = Multihash.encode(:sha2_512, hash1)

    hash2
    |> Base.encode64()
    |> String.replace(~r/[Il0oO=\/\+]/, "", global: true)
    |> String.slice(0..(length - 1))
  end

Multihash takes a hash and prepends info onto it.

<hash-func-type><digest-length><digest-value>

In the example above I am simply pattern matching the first 2 values, doing nothing with them and using the hash2 value in what was the original function call.

This is not how we would use it in production I'm sure, but it does show what we can use the multihash_code or the length variables for

@nelsonic
Copy link
Member Author

@RobStallion this is a good start. thanks! 👍
What is your appetite for doing a "deep dive" into how the CID function works?

@RobStallion
Copy link
Member

@nelsonic I am up for diving deep 💯, but just want to be clear on what you mean by CID function.

Do you mean that this example in the readme shows? Essentially create the functionality that the readme talks about?

@nelsonic
Copy link
Member Author

@RobStallion using multihash is part of the puzzle but it's only the "four corners" of the puzzle.
Take a look at https://proto.school dwyl/learn-ipfs#5

@nelsonic
Copy link
Member Author

GOTO: #11

RobStallion added a commit that referenced this issue Jan 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

2 participants