Commit ace9d19

Kernel reimplementation and bug fixes (#1)
* Init refactor
* Add correlator interface
* Update options file
* Implement alloc and reset functions
* Implement constructor and destructors
* Add basic main
* Add function call to kernel
* Partially Implement correlation kernel
* Improve help message
* Improve help message
* Continue kernel implementation
* Continue kernel re-implementation
* Continue kernel re-implementation
* Finish kernel implementation (hopefully)
* Add transfer function
* Remove unused macros
* Add debug build
* Update debug build
* Add utils
* Update options
* Update example main
* Fix bug in memory allocation
* Fix memory alignment warning
* Partial fix correlate kernel
* Reimplement kernel...
* Implement kernel
* Finish kernel implementation
* Update debug build
* Update get function and generate taus
* Fix bug in autocorrelation kernel
* Add timing to main
* Update main
* Update options
* Naming consistency
* Update run_all.sh script
* Update run_all.sh script
* Remove num_Sensors_per_block as parameter
* Fix kernel implementation
* Update main
* Fix kernel implementation
* Update output (temporary to check results)
* Update tau generation
* Update output results (exec time are temporary)
* Fix kernel (maybe)
* Update results
* Update outputs
* Update run all script
* Memory optimization
* Update results
* Remove useless barriers
* Update results
* Improve insert_until_bin kernel
* Change data types for less memory usage and faster controls
* Update results
* Improve debug info
* Update stuff
* Remove warning print
* Update results
* Fix bugs
* Fix memory misalignment
* Update outputs
* Update time results
* Update timing results for 30000 (64 bit needed)
* Update run configuration
* Add comments
1 parent c5346e3 commit ace9d19

187 files changed: +90402 -91185 lines

.gitignore

Lines changed: 3 additions & 1 deletion
@@ -3,4 +3,6 @@ out/
 *.vcxproj.*
 *.txt
 main
-documentation/Doxygen/
+documentation/Doxygen/
+bin/
+.vscode/

Makefile

Lines changed: 14 additions & 10 deletions
@@ -1,17 +1,21 @@
-NVCC=nvcc
-CXX=g++
+CXX=nvcc

-CXXFLAGS=-Ofast -std=c++11 -D_MWAITXINTRIN_H_INCLUDED -D_FORCE_INLINES
-LIBS=-lcudart
-FLAGS=-Xcompiler -fopenmp -D_MWAITXINTRIN_H_INCLUDED -D_FORCE_INLINES
+CUDAFLAGS=--ptxas-options=-v -m64 -arch compute_61 -code sm_61 -Xptxas -dlcm=ca -Xcompiler -D_FORCE_INLINES -lineinfo --expt-relaxed-constexpr

-CUDAFLAGS=--ptxas-options=-v -O4 -m64 -arch compute_61 -code sm_61 -Xptxas -dlcm=ca -Xcompiler -D_FORCE_INLINES -lineinfo
+BIN_FOLDER=bin
+SRC_FOLDER=src

-all: main
+FILES=${SRC_FOLDER}/main.cu ${SRC_FOLDER}/correlator.cu
+
+all: release

 clean:
-	rm -f *.o main out_data.txt timer_out.txt
+	rm -rf $(BIN_FOLDER)

-main: src/CudaInput.h src/InputVector.h src/BinGroupsMultiSensorMemory.h src/DataFile.h src/Timer.h src/Main.cu src/SensorsDataPacket.h src/ResultArray.h src/options.hpp src/utils.hpp src/Definitions.h
-	$(NVCC) $(CUDAFLAGS) src/Main.cu -o main
+release:
+	mkdir -p $(BIN_FOLDER)
+	$(CXX) $(FILES) $(CUDAFLAGS) -O3 -o $(BIN_FOLDER)/main

+debug:
+	mkdir -p $(BIN_FOLDER)
+	$(CXX) $(FILES) $(CUDAFLAGS) -D_DEBUG_BUILD -g -G -o $(BIN_FOLDER)/main
-49.4 KB
Binary file not shown.
-39.2 KB
Binary file not shown.
-265 KB
Binary file not shown.

documentation/Images/BinGroup.png

-7.9 KB
Binary file not shown.
Binary file not shown.

documentation/Images/SharedMemory.png

-21.9 KB
Binary file not shown.
-333 KB
Binary file not shown.

output/g16-l10/10

Lines changed: 30 additions & 30 deletions
@@ -1,30 +1,30 @@
-3.0204e-05
-2.7346e-05
-2.7161e-05
-2.6487e-05
-2.6146e-05
-2.652e-05
-2.7012e-05
-4.4106e-05
-2.6444e-05
-2.6492e-05
-2.679e-05
-2.5983e-05
-2.6658e-05
-4.1025e-05
-2.6642e-05
-2.6806e-05
-2.5854e-05
-3.4931e-05
-2.6795e-05
-2.9797e-05
-3.438e-05
-2.6897e-05
-3.7489e-05
-3.9591e-05
-4.1523e-05
-2.6488e-05
-2.676e-05
-3.6691e-05
-2.6617e-05
-2.6519e-05
+3.2256e-05
+4.0013e-05
+4.7933e-05
+3.7918e-05
+6.3048e-05
+4.7005e-05
+3.1938e-05
+3.1058e-05
+4.5578e-05
+2.4995e-05
+3.7842e-05
+3.7918e-05
+3.8048e-05
+2.9517e-05
+3.8357e-05
+3.7484e-05
+3.0092e-05
+3.7669e-05
+3.753e-05
+3.7721e-05
+3.7534e-05
+3.8252e-05
+3.942e-05
+3.2329e-05
+3.1416e-05
+3.7708e-05
+3.769e-05
+3.6285e-05
+3.6966e-05
+4.6296e-05
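
The 30 removed and 30 added values in output/g16-l10/10 are easier to compare as summary statistics than line by line. The Python sketch below is not part of the commit. It copies the values from the diff and assumes, based only on the "Update time results" entries in the commit message, that each line is one run's execution time in seconds. The `summarize` helper and the `before`/`after` names are hypothetical.

```python
from statistics import mean, median

# Values copied from the diff of output/g16-l10/10.
# Assumption: each line is the execution time of one run, in seconds.
before = [
    3.0204e-05, 2.7346e-05, 2.7161e-05, 2.6487e-05, 2.6146e-05, 2.652e-05,
    2.7012e-05, 4.4106e-05, 2.6444e-05, 2.6492e-05, 2.679e-05, 2.5983e-05,
    2.6658e-05, 4.1025e-05, 2.6642e-05, 2.6806e-05, 2.5854e-05, 3.4931e-05,
    2.6795e-05, 2.9797e-05, 3.438e-05, 2.6897e-05, 3.7489e-05, 3.9591e-05,
    4.1523e-05, 2.6488e-05, 2.676e-05, 3.6691e-05, 2.6617e-05, 2.6519e-05,
]
after = [
    3.2256e-05, 4.0013e-05, 4.7933e-05, 3.7918e-05, 6.3048e-05, 4.7005e-05,
    3.1938e-05, 3.1058e-05, 4.5578e-05, 2.4995e-05, 3.7842e-05, 3.7918e-05,
    3.8048e-05, 2.9517e-05, 3.8357e-05, 3.7484e-05, 3.0092e-05, 3.7669e-05,
    3.753e-05, 3.7721e-05, 3.7534e-05, 3.8252e-05, 3.942e-05, 3.2329e-05,
    3.1416e-05, 3.7708e-05, 3.769e-05, 3.6285e-05, 3.6966e-05, 4.6296e-05,
]


def summarize(label, samples):
    """Print and return (mean, median, max) for one set of samples."""
    stats = (mean(samples), median(samples), max(samples))
    print(f"{label:>6}: mean={stats[0]:.4e}  median={stats[1]:.4e}  max={stats[2]:.4e}")
    return stats


before_stats = summarize("before", before)
after_stats = summarize("after", after)
```

A single file covers only one run configuration, so any difference seen here may not hold for the other result files touched by this commit.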