Pile.Cache generating an avalanche of "Object reference not set to an instance of an object." #36

Open
agnibos opened this issue Jan 20, 2018 · 6 comments

Comments

@agnibos
Contributor

agnibos commented Jan 20, 2018

This happens on table.sweep and on table.fetchExistingEntry. Once the memory is corrupted, the error keeps recurring on subsequent sweeps:

Sweep exception:

@20180120-022702|8216d599-fcbc-41ca-a4ff-45c4a68590fa|......zw02||
Critical|Data.Cache|LocalCache('MDBDataStore::GraphSystem').threadSpin().foreach.Sweep|0
Leaked exception while sweeping table 'GraphSystemService.Node': [System.NullReferenceException] Object reference not set to an instance of an object.'

  +-Exception 
  | Type      System.NullReferenceException
  | Source    NFX
  | Target    sweep
  | Message   Object reference not set to an instance of an object.
  | Stack     
     at NFX.ApplicationModel.Pile.LocalCacheTable`1._bucket.sweep()
     at NFX.ApplicationModel.Pile.LocalCacheTable`1.Sweep(Stopwatch timer, Int32 maxTimeMs)
     at NFX.ApplicationModel.Pile.LocalCache.threadSpin()

Access/Get Exception:

@20180120-022703|9602f37a-26cc-462b-95cd-d9fe04d05431|.......zw02||
Error|AppMgmt|Agni.Social.Graph.Server.GraphSystemService.GetNode|0
[System.NullReferenceException] Object reference not set to an instance of an object.

  +-Exception 
  | Type      System.NullReferenceException
  | Source    NFX
  | Target    fetchExistingEntry
  | Message   Object reference not set to an instance of an object.
  | Stack     
     at NFX.ApplicationModel.Pile.LocalCacheTable`1.fetchExistingEntry(_bucket bucket, TKey key, Int32 hashCode)
     at NFX.ApplicationModel.Pile.LocalCacheTable`1.Get(TKey key, Int32 ageSec)
     at NFX.ApplicationModel.Pile.CacheExtensions.FetchThrough[TKey,TResult](ICache cache, TKey key, String tblCache, ICacheParams caching, Func`2 fFetch, Func`3 fFilter)
     at Agni.Social.Graph.Server.GraphSystemService.DoGetNode(GDID gNode, ICacheParams cacheParams)
     at Agni.Social.Graph.Server.GraphSystemService.GetNode(GDID gNode)

@itadapter
Contributor

Yes, this happened on 12/19 and then again today. It happens very infrequently, à la heisenbug, which makes me think that this is a multithreading-related issue (improper locking, barrier, sequencing, or a race).

@agnibos
Contributor Author

agnibos commented Jan 20, 2018

See LocalCacheTable.cs#L161. The problem is that you cannot use -1 as a flag: the Age gets updated by a thread all the time, so non-chain entries can get interpreted as "chain".

Why does this happen? Simple: clock drift. The clock can return a negative time delta (the timestamp appears to be in the future), which effectively sets Age to <0. That triggers IsChain==true, but the typecast is not checked later, hence the null reference.

We cannot use this flag (<1).
Maybe add another field?
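To make the failure mode concrete, here is a minimal sketch in Python (not the NFX C# code). Everything in it is hypothetical and only mirrors the description above: `Entry`, `CHAIN_FLAG`, `update_age`, `update_age_clamped` and `chain_length` are illustration names, and clamping is just one possible mitigation, not necessarily what the real fix does.

```python
from dataclasses import dataclass
from typing import Optional

CHAIN_FLAG = -1  # hypothetical sentinel: a negative age marks a "chain" entry


@dataclass
class Entry:
    age_sec: int                  # doubles as the chain flag when negative
    chain: Optional[list] = None  # only populated for genuine chain entries

    @property
    def is_chain(self) -> bool:
        return self.age_sec < 0


def update_age(entry: Entry, created_at: float, now: float) -> None:
    """Naive update: if the clock steps backward, the age goes negative."""
    entry.age_sec = int(now - created_at)


def update_age_clamped(entry: Entry, created_at: float, now: float) -> None:
    """Clamped update: the age of a regular entry never drops below zero."""
    entry.age_sec = max(0, int(now - created_at))


def chain_length(entry: Entry) -> int:
    """Mimics the unchecked cast: trusts is_chain and dereferences chain."""
    if entry.is_chain:
        return len(entry.chain)  # raises TypeError when chain is None
    return 0
```

With the naive update, a regular entry whose creation time lies ahead of the current clock reading ends up with a negative age, gets treated as a chain, and the dereference fails (the Python analogue of the NullReferenceException). With the clamped update the same entry stays a regular entry.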

@itadapter
Contributor

itadapter commented Jan 20, 2018

The _entry gobbles up RAM like crazy, so we have to be mindful about adding fields, as each one makes these _entry[] arrays bigger and bigger.

But at least we have figured it out!
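A rough back-of-the-envelope sketch of that memory concern, in Python. The layouts are hypothetical and are not the real `_entry` fields; they only show how a single extra byte per entry scales with the table size. In a real .NET struct, alignment padding could make the per-entry growth larger than one byte.

```python
import struct

# Hypothetical packed layouts ("<" disables padding); NOT the real NFX _entry.
CURRENT_LAYOUT = "<qii"     # 8-byte key, 4-byte age, 4-byte hash code
WITH_FLAG_LAYOUT = "<qiib"  # same fields plus a 1-byte is-chain flag


def table_bytes(layout: str, entry_count: int) -> int:
    """Total bytes for entry_count entries of the given packed layout."""
    return struct.calcsize(layout) * entry_count


def extra_bytes(entry_count: int) -> int:
    """Additional memory needed once the flag field is added."""
    return table_bytes(WITH_FLAG_LAYOUT, entry_count) - table_bytes(
        CURRENT_LAYOUT, entry_count
    )
```

For example, `extra_bytes(1_000_000)` gives the added cost of the flag for a table of one million entries under these assumed layouts.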

@itadapter
Contributor

This was fixed by 8c45548, but it needs more extensive testing. Keeping the issue open for now.

@agnibos
Contributor Author

agnibos commented Mar 28, 2018

Guys, any news on this? I have not heard anything bad. Close?

@itadapter
Contributor

The issue was resolved.
Let's keep it open for another month just in case.


3 participants