Out of Memory error. #9

Open
srikarpv opened this issue Mar 21, 2019 · 1 comment

Comments

@srikarpv

In startup phase 1, I get the following out-of-memory error when I run python run.py my_experiment --map MoveToBeacon.
Any leads would be appreciated.

The following is my terminal dump.

pygame 1.9.4
Hello from the pygame community. https://www.pygame.org/contribute.html
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Version: B55958 (SC2.3.16)
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 19052 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-_w582aqm/ -displayMode 0'
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 23308 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-y5a4tnsq/ -displayMode 0'
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 23486 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-cgnye12m/ -displayMode 0'
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 18106 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-h363mdtm/ -displayMode 0'
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 18236 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-npywv3_g/ -displayMode 0'
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 22720 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-z9abmp21/ -displayMode 0'
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 23817 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-146rzc5n/ -displayMode 0'
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 22203 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-ni7u77ob/ -displayMode 0'
Starting up...
Starting up...
Starting up...
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 19426 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-b6_448vu/ -displayMode 0'
Starting up...
Starting up...
Starting up...
Starting up...
Startup Phase 1 complete
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 23237 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-rgq3n78x/ -displayMode 0'
Startup Phase 1 complete
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 21354 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-c4b8oyug/ -displayMode 0'
Starting up...
Startup Phase 1 complete
Startup Phase 1 complete
Starting up...
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 21269 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-d0ttex2p/ -displayMode 0'
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 24687 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-1aa1mjj9/ -displayMode 0'
Startup Phase 1 complete
Startup Phase 1 complete
Starting up...
Starting up...
Startup Phase 1 complete
Startup Phase 1 complete
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 20569 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-fm5azxn0/ -displayMode 0'
Version: B55958 (SC2.3.16)
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 21933 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-97_cy5ll/ -displayMode 0'
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 18875 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-ns0i4ous/ -displayMode 0'
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Starting up...
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 22451 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-24znipto/ -displayMode 0'
Startup Phase 1 complete
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 24407 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-7xu9mzlx/ -displayMode 0'
Starting up...
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 24325 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-yd1rflf9/ -displayMode 0'
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 23030 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-pheqfwka/ -displayMode 0'
Startup Phase 1 complete
Startup Phase 1 complete
Startup Phase 1 complete
Starting up...
Starting up...
Startup Phase 1 complete
Startup Phase 1 complete
Starting up...
Starting up...
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 22971 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-h4xafyee/ -displayMode 0'
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 17209 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-ykzflkd7/ -displayMode 0'
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 24343 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-uc2t3v7v/ -displayMode 0'
Startup Phase 1 complete
Starting up...
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 17383 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-x_hy40yp/ -displayMode 0'
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 16438 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-_f6l_k8c/ -displayMode 0'
2019-03-21 14:15:28.637558: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 19645 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-omg8b3qj/ -displayMode 0'
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 24530 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-axjsi41r/ -displayMode 0'
Starting up...
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 16431 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-b_6uv3uq/ -displayMode 0'
Starting up...
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 22425 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-2hybgjl3/ -displayMode 0'
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 21173 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-ep7z7wtq/ -displayMode 0'
Starting up...
Starting up...
Starting up...
Starting up...
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 15643 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-gzpg6xnm/ -displayMode 0'
Version: B55958 (SC2.3.16)
Build: Jul 31 2017 13:19:41
Command Line: '"/home/srikar/StarCraftII/Versions/Base55958/SC2_x64" -listen 127.0.0.1 -port 23582 -dataDir /home/srikar/StarCraftII/ -tempDir /tmp/sc-_bue2gzo/ -displayMode 0'
Startup Phase 1 complete
Startup Phase 1 complete
Starting up...
Starting up...
Starting up...
Startup Phase 1 complete
Startup Phase 1 complete
Startup Phase 1 complete
Startup Phase 1 complete
Starting up...
Startup Phase 1 complete
Starting up...
2019-03-21 14:15:28.828800: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-03-21 14:15:28.829287: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.62
pciBusID: 0000:01:00.0
totalMemory: 3.95GiB freeMemory: 48.69MiB
2019-03-21 14:15:28.829317: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
Starting up...
Startup Phase 1 complete
Startup Phase 1 complete
Starting up...
Starting up...
Startup Phase 1 complete
Startup Phase 1 complete
Startup Phase 1 complete
Startup Phase 1 complete
Startup Phase 1 complete
Startup Phase 1 complete
Startup Phase 1 complete
Startup Phase 1 complete
Traceback (most recent call last):
  File "run.py", line 174, in <module>
    main()
  File "run.py", line 111, in main
    sess = tf.Session()
  File "/home/srikar/anaconda3/envs/tf-gpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1551, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/home/srikar/anaconda3/envs/tf-gpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 676, in __init__
    self._session = tf_session.TF_NewSessionRef(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InternalError: CUDA runtime implicit initialization on GPU:0 failed. Status: out of memory
Startup Phase 2 complete
Creating stub renderer...
Startup Phase 2 complete
Creating stub renderer...
Startup Phase 2 complete
Creating stub renderer...
Startup Phase 2 complete
Creating stub renderer...
Startup Phase 2 complete
Creating stub renderer...
Startup Phase 2 complete
Creating stub renderer...
Startup Phase 2 complete
Creating stub renderer...
Startup Phase 2 complete
Creating stub renderer...
Startup Phase 2 complete
Creating stub renderer...
Startup Phase 2 complete
Creating stub renderer...
Startup Phase 2 complete
Creating stub renderer...
Startup Phase 2 complete
Creating stub renderer...
Startup Phase 2 complete
Creating stub renderer...
Startup Phase 2 complete
Creating stub renderer...
Startup Phase 2 complete
Creating stub renderer...
Startup Phase 2 complete
Creating stub renderer...
Startup Phase 2 complete
Creating stub renderer...
Listening on: 127.0.0.1:18875 (18875)
Listening on: 127.0.0.1:22203 (22203)
Startup Phase 3 complete. Ready for commands.
Startup Phase 3 complete. Ready for commands.
Startup Phase 2 complete
Creating stub renderer...
Startup Phase 2 complete
Creating stub renderer...
W0321 14:16:02.230587 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:02.234593 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:02.622623 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:02.681355 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:02.732240 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:03.222280 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:03.222583 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:03.685055 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:03.812864 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:03.857660 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:03.862667 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
Startup Phase 2 complete
Creating stub renderer...
Startup Phase 2 complete
Creating stub renderer...
W0321 14:16:04.478714 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:04.551949 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:04.553689 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:04.993700 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
Startup Phase 2 complete
Creating stub renderer...
W0321 14:16:05.520868 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:05.521188 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:05.727763 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:05.786274 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:05.837595 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:06.062572 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:06.081720 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:06.171552 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:06.177643 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:06.327630 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:06.327720 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:06.400193 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
Startup Phase 2 complete
Creating stub renderer...
Startup Phase 2 complete
Creating stub renderer...
Startup Phase 2 complete
Creating stub renderer...
W0321 14:16:06.790266 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:06.918255 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:06.924500 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:06.963078 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:06.967833 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:06.996931 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:07.069621 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:07.422630 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:07.584017 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:07.657401 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:07.658561 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:08.011750 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:08.099080 139707238745920 sc_process.py:183] Killing the process.
Startup Phase 2 complete
Creating stub renderer...
W0321 14:16:09.105360 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:09.168418 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:09.186958 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:09.276766 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:09.282816 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:09.505211 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:10.029559 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:10.102171 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:10.174578 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:04.024830 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:10.527849 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:10.800231 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:11.116753 139707238745920 sc_process.py:183] Killing the process.
Startup Phase 2 complete
Creating stub renderer...
W0321 14:16:12.210401 139707238745920 sc_process.py:183] Killing the process.
W0321 14:16:12.789056 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
Failed to create the socket.
Failed to create the socket.
Failed to create the socket.
Failed to create the socket.
Failed to create the socket.
Failed to create the socket.
Failed to create the socket.
Failed to create the socket.
Failed to create the socket.
Failed to create the socket.
Failed to create the socket.
Failed to create the socket.
Failed to create the socket.
Failed to create the socket.
Failed to create the socket.
Failed to create the socket.
Failed to create the socket.
Failed to create the socket.
Failed to create the socket.
Failed to create the socket.
Failed to create the socket.
Failed to create the socket.
Failed to create the socket.
Requesting to join a single player game
Requesting to join a single player game
Configuring interface options
Configuring interface options
Configure: raw interface disabled
Configure: raw interface disabled
Configure: feature layer interface enabled
Configure: feature layer interface enabled
Configure: score interface enabled
Configure: score interface enabled
Configure: render interface disabled
Configure: render interface disabled
Entering load game phase.
Entering load game phase.
Launching next game.
Launching next game.
W0321 14:16:13.328135 139707238745920 sc_process.py:183] Killing the process.
Failed to create the socket.
W0321 14:16:13.905472 139707238745920 sc_process.py:183] Killing the process.
Failed to create the socket.
Next launch phase started: 2
Next launch phase started: 2
Next launch phase started: 3
Next launch phase started: 3
Next launch phase started: 4
Next launch phase started: 4
Next launch phase started: 5
Next launch phase started: 5
Next launch phase started: 6
Next launch phase started: 6
Next launch phase started: 7
Next launch phase started: 7
Next launch phase started: 8
Next launch phase started: 8
W0321 14:16:15.894840 139707238745920 sc_process.py:183] Killing the process.
Failed to create the socket.
Listening on: 127.0.0.1:21269 (21269)
Startup Phase 3 complete. Ready for commands.
Failed to create the socket.
Failed to create the socket.
Failed to create the socket.
W0321 14:16:39.019435 139707238745920 sc_process.py:146] SC2 isn't running, so bailing early on the websocket connection.
W0321 14:16:42.124873 139707238745920 sc_process.py:183] Killing the process.
Failed to create the socket.
^Z^[
[1]+ Stopped python run.py my_experiment --map MoveToBeacon
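
The line in the dump that matters most is probably the device report, "totalMemory: 3.95GiB freeMemory: 48.69MiB": by the time tf.Session() ran, only about 1% of the GPU's memory was still free. The log alone does not say what claimed the rest. Below is a minimal sketch for pulling those two numbers out of a log, assuming the TensorFlow 1.x log format shown above; parse_gpu_memory is a made-up helper name, not part of TensorFlow or pysc2.

```python
import re

# Matches the TensorFlow 1.x device report line from the dump, e.g.
# "totalMemory: 3.95GiB freeMemory: 48.69MiB".
# Assumption: only MiB and GiB units appear, as in this log.
_MEM_RE = re.compile(
    r"totalMemory:\s*([\d.]+)(MiB|GiB)\s+freeMemory:\s*([\d.]+)(MiB|GiB)"
)
_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}


def parse_gpu_memory(log_text):
    """Return (total_mib, free_mib) from the first device report, or None."""
    match = _MEM_RE.search(log_text)
    if match is None:
        return None
    total_mib = float(match.group(1)) * _TO_MIB[match.group(2)]
    free_mib = float(match.group(3)) * _TO_MIB[match.group(4)]
    return total_mib, free_mib


log = "totalMemory: 3.95GiB freeMemory: 48.69MiB"
total_mib, free_mib = parse_gpu_memory(log)
print(f"{free_mib:.2f} MiB free of {total_mib:.2f} MiB")
```

Running the same check on a log from a smaller run would show whether the free memory at session creation actually goes up.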

@erickson656

Try reducing the number of parallel instances from the default of 32 to something more manageable. That worked for me.
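
How the instance count is set depends on run.py, which is not shown in this thread, so check python run.py --help for the real option. As a rough sketch only: the --envs flag and the capped_envs helper below are hypothetical names, and capping at one instance per CPU core is an arbitrary rule of thumb rather than anything the project prescribes.

```python
import argparse
import os


def build_parser():
    # Hypothetical CLI mirroring the command used in this thread; the real
    # run.py may name or handle these options differently.
    parser = argparse.ArgumentParser()
    parser.add_argument("experiment_id")
    parser.add_argument("--map", default="MoveToBeacon")
    # 32 is the default mentioned in the comment above.
    parser.add_argument("--envs", type=int, default=32,
                        help="number of parallel game instances")
    return parser


def capped_envs(requested, cpu_count=None):
    """Clamp the requested instance count to the range [1, cpu_count]."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    return max(1, min(requested, cpu_count))


args = build_parser().parse_args(
    ["my_experiment", "--map", "MoveToBeacon", "--envs", "4"])
print("instances to launch:", capped_envs(args.envs, cpu_count=8))
```

Starting with a small count and raising it while watching memory use is a cautious way to find what a given machine can handle.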
