bad_alloc #1

Open
ronghongbo opened this issue Jul 28, 2016 · 1 comment

Comments

@ronghongbo

Hello Jing,

Here is a small test case that runs into a bad_alloc error:

#include "Halide.h"
#include <stdio.h>
using namespace Halide;
int main(int argc, char **argv) {
    Func e, f, g;
    Var x;
    e(x) = x;
    f(x) = e(x);
    g(x) = f(x);

    Var xi, xo;
    g.split(x, xo, xi, 16).accelerate({e}, xi, xo);
    f.linebuffer();

    Image<int> out = g.realize(100);//, target);
    return 0;
}

Halide-HLS]$ g++ -std=c++11 -g -fno-omit-frame-pointer -fno-rtti -Wall -Werror -Wno-unused-function -Wcast-qual -Wignored-qualifiers -Wno-comment -Wsign-compare -O3 test/correctness/gpu_dynamic_shared.cpp -Iinclude -Lbin -lHalide -lpthread -ldl -lz -rdynamic -Wl,--rpath=/home/hrong/Halide-HLS/bin -o t

Halide-HLS]$ ./t
Warning at test/correctness/gpu_dynamic_shared.cpp:34:
No linebuffer inserted after function f.
Warning at test/correctness/gpu_dynamic_shared.cpp:34:
No linebuffer inserted after function .
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted (core dumped)

Thanks!
Hongbo

@jingpu
Owner

jingpu commented Jul 28, 2016

@ronghongbo The HLS backend cannot be used with JIT compilation (i.e. func.realize()) yet.
The fanout example is similar to your test case, so it may be a good starting point:
https://github.com/jingpu/Halide-HLS/blob/HLS/apps/hls_examples/fanout_hls/pipeline.cpp

jingpu pushed a commit that referenced this issue Aug 30, 2016
Tools::Image<T> constructs from array
jingpu pushed a commit that referenced this issue Jun 14, 2017
Two changes:

1) Use 32 free bits in the IRNode to store the IRNodeType, so that it
can be gotten at by a load instead of having to call a virtual function

2) Change things to a style where any function that's going to make a
copy of an Expr takes it by value, and then does a std::move internally
at its last use. This avoids a bunch of atomic ops and conditional
branches in caller code in the case where you're passing in an rvalue.

With these changes, lowering local laplacian gets about 12% faster (1.5s
-> 1.33s). Most of the win is from change #1.

These are being done in advance of a planned change to simplify the
simplifier to be more concise and use less stack space, so hopefully the
fact that this represents a partial reversion of #1810 won't bite us.
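
The two changes described in the commit message above can be sketched in isolation. What follows is only a minimal illustration under stated assumptions, not Halide's actual code: `Node`, `NodeType`, `Handle`, `IntImm`, `Add`, and `make_add` are hypothetical names, and `std::shared_ptr` stands in for Halide's reference-counted `Expr`.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-ins for illustration; these are not Halide's real types.
enum class NodeType : int { IntImm, Add };

struct Node {
    // Change 1: the type tag is an ordinary data member, so reading it is a
    // plain load rather than a virtual function call.
    const NodeType node_type;
    explicit Node(NodeType t) : node_type(t) {}
    virtual ~Node() = default;
};

// Reference-counted handle, standing in for an Expr.
using Handle = std::shared_ptr<const Node>;

struct IntImm : Node {
    int value;
    explicit IntImm(int v) : Node(NodeType::IntImm), value(v) {}
};

struct Add : Node {
    Handle a, b;
    // Change 2: the constructor keeps copies of its operands, so it takes
    // them by value and moves each one at its last use.
    Add(Handle lhs, Handle rhs)
        : Node(NodeType::Add), a(std::move(lhs)), b(std::move(rhs)) {}
};

// A factory that follows the same by-value, move-at-last-use convention.
Handle make_add(Handle lhs, Handle rhs) {
    return std::make_shared<Add>(std::move(lhs), std::move(rhs));
}
```

When a caller passes an lvalue, the handle is copied exactly once (one reference-count increment). When a caller passes an rvalue, for example `make_add(one, std::move(two))`, the second operand is moved all the way into the node and its reference count is never touched.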