I am using Python multiprocessing together with vaex to compute text embeddings and save them to HDF5. Everything works normally at first, but after the program has been running for a while, the saved HDF5 file ends up like this:
All of the data is lost. Here is my code:
`import gzip
import hashlib
import json
import logging
import os
import warnings
from multiprocessing import Pool
import numpy as np
import vaex
from sentence_transformers import SentenceTransformer
warnings.filterwarnings("ignore")
matching_files = ["x1.json.gz", "x2.json.gz", "x3.json.gz", ...]
print("TOTAL # JOBS:", len(matching_files))
print(matching_files)
def save_embedding(file_path):
    cuda_num = int(file_path.split(".")[0][-4:]) % 8
    save_name = file_path.split("/")[-1].split(".")[0]
    save_path = "xxx"

with Pool(8) as p:
    p.map(save_embedding, matching_files)
`
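To make the path handling in `save_embedding` easier to follow, here is a minimal sketch of the two path-derived values, pulled out into standalone helpers. The helper names `gpu_index` and `output_name` and the example path `data/part_0013.json.gz` are hypothetical and only for illustration; the sketch assumes the real file names end in four digits before the first dot, which the placeholder names above do not show.

```python
def gpu_index(file_path: str, num_gpus: int = 8) -> int:
    """Pick a GPU from the last four characters before the first dot."""
    stem = file_path.split(".")[0]    # "data/part_0013.json.gz" -> "data/part_0013"
    return int(stem[-4:]) % num_gpus  # "0013" -> 13, and 13 % 8 == 5


def output_name(file_path: str) -> str:
    """Base file name with the directory and all extensions removed."""
    return file_path.split("/")[-1].split(".")[0]


if __name__ == "__main__":
    example = "data/part_0013.json.gz"  # hypothetical path, not from my data
    print(gpu_index(example), output_name(example))
```

With 8 worker processes and 8 GPUs, two input files whose numeric suffixes are congruent modulo 8 map to the same GPU, and each input file gets its own `save_name`.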