OOME in GmlLoader with large GML Files #1634

Open
lgoltz opened this issue Jan 19, 2024 · 0 comments
Labels
bug error issue and bug (fix) tools deegree command line tools (CLI)

lgoltz commented Jan 19, 2024

Importing large GML files can result in an OutOfMemoryError.

Iterating over a StreamFeatureCollection with

GMLVersion version = GMLVersion.GML_32;
XMLInputFactory xmlInputFactory = XMLInputFactory.newFactory();
xmlInputFactory.setProperty(XMLInputFactory.IS_COALESCING, true);
XMLStreamReader xmlStreamReader = xmlInputFactory.createXMLStreamReader(inputStream);
XMLStreamReaderWrapper xmlStream = new XMLStreamReaderWrapper(xmlStreamReader, null);
GMLStreamReader gmlStreamReader = GMLInputFactory.createGMLStreamReader(version, xmlStream);
gmlStreamReader.setApplicationSchema(findSchema());
SkipInternalGmlDocumentIdContext resolver = new SkipInternalGmlDocumentIdContext(version);
resolver.setReferencePatternMatcher(parseDisabledResources());
gmlStreamReader.setResolver(resolver);
StreamFeatureCollection featureStream = gmlStreamReader.readFeatureCollectionStream();

Iterator<Feature> featureIterator = featureStream.iterator();
int numberOfReadFeatures = 0;
while (featureIterator.hasNext()) {
  Feature feature = featureIterator.next();
  ++numberOfReadFeatures;
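  // Force a full GC every 5000 features so the log shows the retained heap
  if (numberOfReadFeatures % 5000 == 0) {
    System.gc();
  }
  // Stop after 150000 features
  if (numberOfReadFeatures % 150000 == 0) {
    System.gc();
    System.exit(0);
  }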
}

and enabling GC logging (-verbose:gc) produces the following log:

...
[12,008s][info][gc] GC(49) Pause Full (System.gc()) 969M->830M(1404M) 274,616ms
[12,490s][info][gc] GC(50) Pause Full (System.gc()) 992M->868M(1468M) 284,679ms
[12,997s][info][gc] GC(51) Pause Full (System.gc()) 1039M->906M(1534M) 301,604ms
[13,502s][info][gc] GC(52) Pause Full (System.gc()) 1071M->944M(1594M) 304,140ms
[14,022s][info][gc] GC(53) Pause Full (System.gc()) 1109M->981M(1654M) 317,533ms
[14,552s][info][gc] GC(54) Pause Full (System.gc()) 1150M->1019M(1718M) 330,499ms
[15,107s][info][gc] GC(55) Pause Full (System.gc()) 1189M->1057M(1784M) 346,625ms
[15,669s][info][gc] GC(56) Pause Full (System.gc()) 1229M->1096M(1850M) 354,527ms
[16,232s][info][gc] GC(57) Pause Full (System.gc()) 1261M->1133M(1908M) 361,937ms
[16,806s][info][gc] GC(58) Pause Full (System.gc()) 1300M->1171M(1970M) 371,646ms
[17,167s][info][gc] GC(59) Pause Full (System.gc()) 1171M->1171M(1970M) 360,479ms

The heap still occupied after each full GC grows by ~38M per 5000 features.
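
The growth rate can be checked against the post-GC heap sizes in the log. The following is a minimal sketch of that arithmetic, using the values after the arrow from GC(49) through GC(58). GC(59) is left out because it appears to be the second System.gc() call at the same feature count. The class and method names are illustrative only, and the per-feature figure assumes that "M" in the JVM log means MiB.

```java
import java.util.Locale;

public class HeapGrowth {

    // Heap occupancy in M after each full GC, copied from GC(49)..GC(58) in the log.
    static final long[] HEAP_AFTER_GC_M = {830, 868, 906, 944, 981, 1019, 1057, 1096, 1133, 1171};

    // Average increase between consecutive samples.
    static double averageGrowth(long[] samples) {
        if (samples.length < 2) {
            throw new IllegalArgumentException("need at least two samples");
        }
        long total = samples[samples.length - 1] - samples[0];
        return (double) total / (samples.length - 1);
    }

    public static void main(String[] args) {
        double perBatchM = averageGrowth(HEAP_AFTER_GC_M);
        // One batch is 5000 features; assumes 1 M = 1024 KiB.
        double perFeatureKiB = perBatchM * 1024 / 5000;
        System.out.printf(Locale.ROOT, "%.1f M per 5000 features, about %.1f KiB per feature%n",
                perBatchM, perFeatureKiB);
    }
}
```

With these ten samples the average comes out just under 38M per batch, or roughly 8 KiB retained per feature.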

In

public void addObject(GMLObject object) {
  String id = object.getId();
  if (id != null && id.length() > 0) {
    idToObject.put(object.getId(), object);
  }
}

every feature with a non-empty ID is put into the idToObject map of the GmlDocumentIdContext, so all features stay strongly referenced and cannot be reclaimed by the GC.
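
The retention pattern can be reproduced without deegree. The sketch below is a hypothetical stand-in, not deegree code: Item, RetainingContext and ClearingContext are invented names. RetainingContext mirrors the addObject logic quoted above. ClearingContext shows one possible contrast, dropping registered objects once a feature has been handled, which would only be acceptable if later features never need to resolve references to earlier ones.

```java
import java.util.HashMap;
import java.util.Map;

public class IdRetentionSketch {

    // Hypothetical stand-in for a parsed feature; not a deegree type.
    static final class Item {
        final String id;
        final byte[] payload = new byte[1024];

        Item(String id) {
            this.id = id;
        }
    }

    // Same shape as the quoted addObject: every object with a non-empty id is stored.
    static class RetainingContext {
        final Map<String, Item> idToObject = new HashMap<>();

        void addObject(Item item) {
            String id = item.id;
            if (id != null && id.length() > 0) {
                idToObject.put(id, item);
            }
        }

        // Called after a feature has been consumed; the retaining variant keeps everything.
        void featureDone() {
        }
    }

    // Assumed alternative: forget registered objects after each feature.
    static final class ClearingContext extends RetainingContext {
        @Override
        void featureDone() {
            idToObject.clear();
        }
    }

    // Registers 'count' items and returns how many are still referenced by the context.
    static int retainedAfter(RetainingContext context, int count) {
        for (int i = 0; i < count; i++) {
            context.addObject(new Item("id-" + i));
            context.featureDone();
        }
        return context.idToObject.size();
    }

    public static void main(String[] args) {
        System.out.println("retaining: " + retainedAfter(new RetainingContext(), 5000));
        System.out.println("clearing:  " + retainedAfter(new ClearingContext(), 5000));
    }
}
```

In the retaining variant the map size equals the number of items read, which matches the linear heap growth in the GC log. In the clearing variant the map is empty after every item, so the items become eligible for collection.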

@tfr42 tfr42 added bug error issue and bug (fix) tools deegree command line tools (CLI) labels Jan 24, 2024
@tfr42 tfr42 changed the title OOM in GmlLoader with large GML Files OOME in GmlLoader with large GML Files Mar 14, 2024