Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Not to sync the etcd metadata when get object with ipc client. #1880

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

dashanji
Copy link
Member

@dashanji dashanji commented Apr 24, 2024

What do these changes do?

Reproduce Steps

When we connect to an external vineyardd with etcd metadata, the latency is high as we'll sync the metadata of an object from the metadata service even if it's a local one.

$ vineyardd --socket /tmp/vineyard_test.sock
import vineyard
import numpy as np

client = vineyard.connect('/tmp/vineyard_test.sock')

data = np.arange(100000)
oid = client.put(data)

import timeit

elapsed_time = timeit.timeit("client.get(oid)", globals=globals(), number=10) / 10

print(f"Average time: {elapsed_time} seconds")

Original Result

Average time:  0.0005963184870779514 seconds

Result with this PR

Average time:  00034987758845090867 seconds

Notice, this pr will not address the comment as it uses an internal vineyardd with local metadata.

Related issue number

Fix part of #1877

Signed-off-by: Ye Cao <caoye.cao@alibaba-inc.com>
@sighingnow
Copy link
Member

The synchronization is by-design to avoid "put but cannot get" problem.

@dashanji
Copy link
Member Author

dashanji commented Apr 25, 2024

The synchronization is by-design to avoid "put but cannot get" problem.

Thanks for the details. The current get function will get the metadata first (https://github.com/v6d-io/v6d/blob/main/python/vineyard/core/client.py#L577), maybe we can add a parameter in the GetObject to avoid getting the metadata twice. What do you think?

@dashanji
Copy link
Member Author

dashanji commented Apr 25, 2024

In addition, currently it seems that the implementation of python resolver does take a lot of time. Do you have any suggestions? Thanks!

./bin/vineyardd --socket /tmp/vineyard_test.sock --meta local
import vineyard
import numpy as np

client = vineyard.connect('/tmp/vineyard_test.sock')

data = np.arange(100000)
oid = client.put(data)

import time
s_time = time.time()
client._ipc_client.get(oid)
e_time = time.time()
elapsed_time = e_time - s_time

print(f"Average time: {elapsed_time} seconds")

Use the line_profiler (https://github.com/pyutils/line_profiler) to profile the python code.

$ kernprof -lv ../test.py
Average time: 0.0005114078521728516 seconds
Wrote profile results to test.py.lprof
Timer unit: 1e-06 s

Total time: 0.000226283 s
File: /opt/caoye/v6d/python/vineyard/core/resolver.py
Function: run at line 58

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    58                                               @profile
    59                                               def run(self, obj: Any, **kw):
    60         1          5.6      5.6      2.5          typename = obj.meta.typename
    61         1         16.7     16.7      7.4          prefix, resolver = find_most_precise_match(typename, self._factory)
    62         1          0.9      0.9      0.4          vineyard_client = kw.pop('__vineyard_client', None)
    63         1          0.3      0.3      0.1          if prefix:
    64         1         53.3     53.3     23.6              resolver_func_sig = inspect.getfullargspec(resolver)
    65         1          0.6      0.6      0.3              if resolver_func_sig.varkw is not None:
    66                                                           value = resolver(obj, resolver=self, **kw)
    67                                                       else:
    68         1          0.1      0.1      0.1                  try:
    69                                                               # don't pass the `**kw`.
    70         1          0.6      0.6      0.3                      if 'resolver' in resolver_func_sig.args:
    71                                                                   value = resolver(obj, resolver=self)
    72                                                               else:
    73         1        134.7    134.7     59.5                          value = resolver(obj)
    74                                                           except Exception as e:
    75                                                               raise RuntimeError(  # pylint: disable=raise-missing-from
    76                                                                   'Unable to construct the object using resolver: '
    77                                                                   'typename is %s, resolver is %s' % (obj.meta.typename, resolver)
    78                                                               ) from e
    79         1          0.3      0.3      0.1              if value is None:
    80                                                           # if the obj has been resolved by pybind types, and there's no proper
    81                                                           # resolver, it shouldn't be an error
    82                                                           if type(obj) is not Object:  # pylint: disable=unidiomatic-typecheck
    83                                                               return obj
    84                                           
    85                                                           # we might `client.put(None)`
    86                                                           return None
    87                                           
    88                                                       # associate a reference to the base C++ object
    89         1          0.1      0.1      0.1              try:
    90         1          1.1      1.1      0.5                  setattr(value, '__vineyard_ref', obj)
    91         1          0.4      0.4      0.2                  setattr(value, '__vineyard_client', vineyard_client)
    92                                           
    93                                                           # register methods
    94         1          3.3      3.3      1.4                  from vineyard.core.driver import get_current_drivers
    95                                           
    96         1          8.0      8.0      3.5                  get_current_drivers().resolve(value, obj.typename)
    97                                                       except AttributeError:
    98                                                           pass
    99                                           
   100         1          0.2      0.2      0.1              return value
   101                                                   # keep it as it is
   102                                                   return obj

Total time: 1.934e-06 s
File: /opt/caoye/v6d/python/vineyard/core/resolver.py
Function: get_current_resolvers at line 123

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   123                                           @profile
   124                                           def get_current_resolvers() -> ResolverContext:
   125                                               '''Obtain current resolver context.'''
   126         1          1.2      1.2     64.4      default_resolver = getattr(_resolver_context_local, 'default_resolver', None)
   127         1          0.5      0.5     26.4      if not default_resolver:
   128                                                   default_resolver = default_resolver_context.extend()
   129         1          0.2      0.2      9.2      return default_resolver

Total time: 0.000477614 s
File: /opt/caoye/v6d/python/vineyard/core/resolver.py
Function: get at line 194

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   194                                           @profile
   195                                           def get(
   196                                               client,
   197                                               object_id: Optional[ObjectID] = None,
   198                                               name: Optional[str] = None,
   199                                               resolver: Optional[ResolverContext] = None,
   200                                               fetch: bool = False,
   201                                               **kwargs
   202                                           ):
   203                                               """Get vineyard object as python value.
   204                                           
   205                                               .. code:: python
   206                                           
   207                                                   >>> arr_id = vineyard.ObjectID('00002ec13bc81226')
   208                                                   >>> arr = client.get(arr_id)
   209                                                   >>> arr
   210                                                   array([0, 1, 2, 3, 4, 5, 6, 7])
   211                                           
   212                                               Parameters:
   213                                                   client: IPCClient or RPCClient
   214                                                       The vineyard client to use.
   215                                                   object_id: ObjectID
   216                                                       The object id that will be obtained from vineyard.
   217                                                   name: ObjectID
   218                                                       The object name that will be obtained from vineyard, ignored if
   219                                                       ``object_id`` is not None.
   220                                                   resolver:
   221                                                       When retrieving vineyard object, an optional *resolver* can be specified.
   222                                                       If no resolver given, the default resolver context will be used.
   223                                                   fetch:
   224                                                       Whether to trigger a migration when the target object is located on
   225                                                       remote instances.
   226                                                   kw:
   227                                                       User-specific argument that will be passed to the builder.
   228                                           
   229                                               Returns:
   230                                                   A python object that return by the resolver, by resolving an vineyard object.
   231                                               """
   232                                               # wrap object_id
   233         1          0.7      0.7      0.1      if object_id is not None:
   234         1          2.2      2.2      0.5          if isinstance(object_id, (int, str)):
   235                                                       object_id = ObjectID(object_id)
   236                                               elif name is not None:
   237                                                   object_id = client.get_name(name)
   238                                               
   239         1        218.8    218.8     45.8      obj = client.get_object(object_id)
   240         1          0.3      0.3      0.1      if resolver is None:
   241         1         11.2     11.2      2.4          resolver = get_current_resolvers()
   242         1        244.4    244.4     51.2      return resolver(obj, __vineyard_client=client, **kwargs)

@github-actions github-actions bot added the stale label May 12, 2024
Copy link
Contributor

/cc @sighingnow, this issus/pr has had no activity for a long time, please help to review the status and assign people to work on it.

@sighingnow
Copy link
Member

You could add a new option instead, rather than change the default behavior, as lack of the sync may cause new bugs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants