<!DOCTYPE html>
<html lang="en-US">
<head>
<title>Lefteris Sidirourgos -- Λευτέρης Σιδηρουργός</title>
<meta charset="UTF-8"/>
<meta name="keywords" content="Lefteris, Eleftherios, Sidirourgos, Λευτέρης, Ελευθέριος, Σιδηρουργός">
<meta name="author" content="Lefteris Sidirourgos">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="w3.css">
<link rel="stylesheet" href="page.css">
</head>
<body>
<!-- begin header -->
<header>
<div class="w3-content w3-display-container" style="max-width:1000px; height:160px;" >
<div class="w3-display-left" style="display:inline-block;">
<figure style="display:inline-block; vertical-align:middle;">
<img class="w3-round" src="imgs/lefterios.jpg" alt="Lefteris Sidirourgos - Λευτέρης Σιδηρουργός" height="160px">
</figure>
<div style="display:inline-block; vertical-align:middle; text-align:left; padding:10px;">
<div id=name>Lefteris Sidirourgos</div>
<div id=name>Λευτέρης Σιδηρουργός</div>
<div id=affiliation></div>
<div id=affiliation>MADgIK, Dept of Informatics</div>
<div id=affiliation>University of Athens</div>
</div>
</div>
<div class="w3-display-right">
<figure style="display:inline-block; vertical-align:middle;">
<a href="http://www.madgik.di.uoa.gr/" target="_blank">
<img src="imgs/madgik.png" alt="madgik">
</a>
</figure>
</div>
</div>
</header>
<!-- end header -->
<nav>
<div class="w3-content" style="max-width:1000px">
<div style="display:flex; justify-content:space-around;">
<a href="#">research</a>
<a href="#publications">publications</a>
<a href="#bio">bio</a>
<a href="#talks">talks</a>
<a href="#service">service</a>
<a href="#contact">contact</a>
</div>
</div>
</nav>
<main class="w3-content" style="max-width:1000px" >
<!--begin of research-->
<section id=research>
<p style="text-align:left">
My research interest is database architecture. I obtained a PhD from the University of
Amsterdam while I was a member of the Database Architectures Group of CWI in Amsterdam. I have
been a core developer of the MonetDB open source database system since 2007.
</p>
<p>
Previous affiliations include:
</p>
<ul>
<li><span class="date">2017-2018</span> Post-Doctoral Researcher at the
<a href="https://www.systems.ethz.ch/" target="_blank">Systems Group</a>
of ETH Zürich.
<a href="https://www.ethz.ch/en.html" style="display:inline-block; vertical-align:top;" target="_blank">
<img src="imgs/ethz_logo_black.svg" alt="eth" height="20px">
</a>
</li>
<li><span class="date">2007-2017</span> Researcher at the
<a href="https://www.cwi.nl/research/groups/database-architectures" target="_blank">
Database Architectures
</a>
group of CWI, Amsterdam.
<a href="https://www.cwi.nl" style="display:inline-block; vertical-align:middle;" target="_blank">
<img src="imgs/cwi_logo.png" alt="cwi" height="40px">
</a>
</li>
<li>
In the summer of <span class="date">2012</span>, I was a research intern at
<a href="https://www.microsoft.com/en-us/research/lab/microsoft-research-redmond/" target="_blank">
Microsoft Research Lab</a>, in Redmond, WA.
</li>
<li><span class="date">2001-2005</span> Member of the
<a href="http://www.ics.forth.gr/isl/index_main.php?l=e&c=253" target="_blank">
Information Systems Laboratory
</a> at ICS-FORTH, Heraklion, Crete.
<a href="http://www.ics.forth.gr/index.html" style="display:inline-block; vertical-align:middle;" target="_blank">
<img src="imgs/icsforth_logo.jpg" alt="ics-forth" height="50px">
</a>
</li>
</ul>
<p>
My GitHub <a href="https://github.com/lsidir" target="_blank">profile <img src="imgs/github_logo.png" alt="github" height="30px"></a> hosts various repositories, such as the stand-alone SIMD implementation of <a href="https://github.com/lsidir/imprints" target="_blank">Column Imprints</a>.
</p>
<p>
Most of my free time at work is consumed by
<a href="https://www.monetdb.org/Home" style=" vertical-align:middle;" target="_blank">
<img src="imgs/monetdblogo.png" alt="monetdb" height="50px">
MonetDB: The column-store pioneer</a>. My contributions to MonetDB include supporting and maintaining the GDK kernel; the join, select, and sampling
operators; bug fixing; the original implementations of column imprints and the ordered index; and
code for supporting non-relational data models such as RDF and XML.
</p>
<p>My PhD research was funded by the Netherlands Organisation for Scientific Research <a href="https://www.nwo.nl/en/" target="_blank"><img src="imgs/nwo_logo.png" alt="nwo" height="30px"></a> under the project <a href="http://www.narcis.nl/research/RecordID/OND1336213/Language/en" target="_blank">Querying while Transforming Large Graph Databases</a>.
</p>
<p align="right"><a href="#">back to top</a></p>
</section>
<!-- end of research-->
<section id=publications class=publications>
<h1>Publications</h1>
<a href="http://dl.acm.org/author_page.cfm?id=81388596471" target="_blank" rel="external">ACM Digital Library</a>
-
<a href="http://dblp2.uni-trier.de/pers/hd/s/Sidirourgos:Lefteris" target="_blank" rel="external">DBLP</a>
-
<a href="http://scholar.google.ch/citations?user=SAJM3PkAAAAJ&hl=en" target="_blank" rel="external">Google Scholar</a>
<!--
<h2>2017</h2>
<table id=pub>
<tr>
<td>[c9]</td>
<td>
Sidirourgos, Lefteris, and Hannes Mühleisen. <i>"Scaling Column Imprints using Advanced Vectorization."</i> 13th International Workshop on Data Management on New Hardware (DaMoN 2017). In Proceedings of the 2017 ACM SIGMOD/PODS International Conference on Management of Data (SIGMOD'17). ACM, 2017
</td>
</tr>
</table>
<h2>2013</h2>
<h2>2012</h2>
<h2>2008</h2>
N2. Sidirourgos Lefteris, Column-Store Support for RDF Data Management: not all swans are white. In Dutch-Belgian Database Day. Namur, Belgium, Oct 10, 2008</p>
N3.Θεσιακός Χειρισμός των Ενημερώσεων Τιμών σε Βάσεις Στηλικής Αποθήκευσης
Sandor Heman (Vectorwise), Marcin Zukowski (Vectorwise), Niels Nes (Centrum Wiskunde &
Informatica), Λευτέρης Σιδηρουργός (Centrum Wiskunde & Informatica), Peter Boncz
(Centrum Wiskunde & Informatica)
-->
<p align="right"><a href="#">back to top</a></p>
</section>
<section id=bio>
<h1> Bio </h1>
<!--p><a href="cv/sidirourgos_cv_gr.pdf" target="_blank" download>Βιογραφικό</a> </p-->
<p>
Lefteris Sidirourgos is a postdoctoral researcher at the Systems Group of ETH Zürich. Before that, he was a postdoctoral
researcher at the Database Architectures Group of CWI in Amsterdam. He received his Ph.D. in Computer Science from the
University of Amsterdam. He also holds an M.Sc. in Computer Science and a B.Sc. in Mathematics from the University of Crete, Greece.
He has been a core developer of the MonetDB open source database system since 2007. He has also worked with
commercial database engines such as Microsoft Hekaton, Actian Vectorwise, and LogicBlox. His main research interests
include indexing, storage, compression, sampling, approximate query processing, and graph processing. He always
looks for opportunities to expand his research into new areas of computer science.
</p>
<p><a href="cv/sidirourgos_cv.pdf" target="_blank" download>Curriculum Vitae</a></p>
<p align="right"><a href="#">back to top</a></p>
</section>
<!-- begin talks-->
<section id=talks>
<h1>Invited Talks</h1>
<div class="title">A Nexus of Indexes</div>
<span class="date">11.2016</span>
<span class="location">Systems Group, ETH Zürich</span>
<details>
<summary>Abstract</summary>
<p>
The large memories available for read-optimized databases have changed the landscape of index creation and exploitation.
Instead of complex data structures aimed at fast concurrent OLTP updates, modern column stores rely on fast scans over
(sorted) partitions and/or the use of partial indices. In this talk we will present two such indexes, namely imprints
and secondary projections. These indexes are designed to alleviate the storage duplication caused by complex primary
indexes and support a compact representation of intermediates in the order required for the remainder of the query
execution plan. We will also present our vision for a nexus of indexes, where an index is not a stand-alone structure
inside a database engine, but part of a series of connections linking two or more indexes. Finally, we will discuss
future work on hardware-driven index design that extends across the three layers of shared-memory programming.
</p>
</details>
<div class="title">A Database System with Amnesia</div>
<span class="date">09.2016</span>
<span class="location">CWI Scientific Meeting, Amsterdam</span>
<details>
<summary>Abstract</summary>
<p>
Big Data comes with huge challenges. Its volume and velocity make handling, curating, and analytical processing a
costly affair. Even to simply “look at” the data within an a priori defined budget and with a guaranteed interactive
response time frame might be impossible. Scale-out approaches will hit the technology and monetary wall soon,
if they have not done so already. Blindly dumping data when the channel is full, or reducing the data resolution at the source,
might lead to loss of valuable observations. An army of well-educated database administrators or full software
stack architects might deal with the challenges, but a seemingly knobless DBMS is to be preferred. A fundamental change
in database management is called for. One approach, called data rotting, has been proposed as an alternative solution.
For the sake of storage management and responsiveness, it lets the DBMS semi-autonomously rot away data. Rotting is based
on the system's own unwillingness to keep old data as easily accessible as fresh data. Our work sheds more light on the
opportunities and potential impacts of this radical departure in data management. Specifically, we study the case where
a DBMS selectively forgets tuples (by marking them inactive) under various amnesia scenarios and with different implementation
strategies. Our final aim is to use the findings of this study to morph an existing data management engine to serve demanding
big data scientific applications with well-chosen data amnesia algorithms.
</p>
</details>
<div class="title">Bits are Valuable: making a Frugal Database Kernel</div>
<span class="date">06.2013</span>
<span class="location">LogicBlox, Atlanta, GA</span>
<details>
<summary>Abstract</summary>
<p>
In this talk, we are going to iterate over a collection of indexes
in the making. Each index serves a different query scenario, but
they all share one design principle: they use just a few
bits to decide if a large piece of data needs to be fetched from
storage. Transferring data is costly, so each bit that encodes
information to avoid unnecessary data transfer is valuable.
A frugal database kernel leaves no bits behind.
To give one example, we introduce the column imprint, a simple but efficient
cache-conscious secondary index. A column imprint is a collection of
many small bit vectors, each indexing the data points of a single
cacheline. Next, we will talk about Split Bloom filters. In this
scenario, instead of using a single large Bloom filter, we split it
into multiple smaller Bloom filters, each one covering a separate subset
of tuples. This makes it possible to adjust the size of the filters
and use more bits for subsets that are larger and/or more frequently
accessed. Finally, we will present future ideas on how multiple Bloom
filters can be used to implement unary and binary database operators that
work over extra large data sets.
</p>
</details>
<div class="title">Column Imprints: A Secondary Index Structure</div>
<span class="date">04.2013</span>
<span class="location">DIAS Lab, EPFL, Lausanne</span>
<details>
<summary>Abstract</summary>
<p>
Large-scale data warehouses rely heavily on secondary indexes,
such as bitmaps and B-trees, to limit access to slow IO devices.
However, with the advent of large main memory systems, cache-conscious
secondary indexes are also needed to improve the transfer
bandwidth between memory and CPU. In this presentation, we introduce
the column imprint, a simple but efficient cache-conscious secondary index.
A column imprint is a collection of many small bit
vectors, each indexing the data points of a single cacheline. An
imprint is used during query evaluation to limit data access and
thus minimize memory traffic. The compression for imprints is
CPU-friendly and exploits the empirical observation that data often
exhibits local clustering or partial ordering as a side-effect of the
construction process. Most importantly, column imprint compression remains
effective and robust even in the case of unclustered data, while other
state-of-the-art solutions fail. We conducted an
extensive experimental evaluation to assess the applicability and
the performance impact of the column imprints. The storage
overhead, when experimenting with real world datasets, is just a few
percent over the size of the columns being indexed. The evaluation
time for over 40000 range queries of varying selectivity revealed
the efficiency of the proposed index compared to zonemaps and
bitmaps with WAH compression.
</p>
</details>
<div class="title">SciBORQ: Scientific data management with Bounds On Runtime and Quality</div>
<span class="date">01.2011</span>
<span class="location">HP Labs, Palo Alto, CA</span>
<div></div>
<span class="date">01.2011</span>
<span class="location">IBM Research Almaden, San Jose, CA</span>
<details>
<summary>Abstract</summary>
<p>
Data warehouses underlying virtual observatories stress the capabilities of
database management systems in many ways. They are filled on a daily basis with
large amounts of factual information, derived from intensive data scrubbing
and computational feature extraction pipelines. Querying these huge databases
requires a sizable computing cluster, while ideally the initial investigation
should run interactively on as few resources as possible.
</p>
<p>
In this talk, we will explore a different route, based on the observation that
at any given time only a fraction of the data is of primary value for a
specific task. This fraction becomes the focus of scientific reflection through
an iterative process of ad-hoc query refinement. We will present SciBORQ, a
framework for scientific data exploration that allows precise control over the
runtime and the quality of query answering. Novel techniques are presented to
derive multiple interesting data samples, called impressions. An impression is
selected such that the statistical error of a query answer remains low, while
the result can be computed within strict time bounds. Impressions differ from
previous sampling approaches in their bias towards the focal point of the
scientific data exploration, their multi-layer design, and their adaptiveness
to shifting query workloads.
</p>
</details>
<p align="right"><a href="#">back to top</a></p>
</section>
<!-- end talks-->
<!-- begin service -->
<section id=service>
<h1>Professional Service</h1>
<h2>Program Committee Member</h2>
<ul>
<li><span class="date">2017</span> Reproducibility Committee – ACM SIGMOD 2017/2018 Reproducibility
<li><span class="date">2017</span> Research Track – 43rd International Conference on Very Large Data Bases (VLDB) 2017
<li><span class="date">2016</span> Demo Track – 42nd International Conference on Very Large Data Bases (VLDB) 2016
<li><span class="date">2014</span> 2nd International workshop on Benchmarking RDF Systems (BeRSys 2014), co-located with VLDB
</ul>
<h2>Journal Reviewer</h2>
<ul>
<li><span class="date">2018</span> The VLDB Journal
<li><span class="date">2017</span> Elsevier Journal of Information Systems (IS)
<li><span class="date">2010, 2015, 2017</span> IEEE Transactions on Knowledge and Data Engineering (IEEE TKDE)
<li><span class="date">2010, 2011, 2012, 2014</span> Elsevier Journal of Web Semantics (JWS)
<li><span class="date">2008</span> Springer Journal of Computer Science and Technology (JCST)
</ul>
<p align="right"><a href="#">back to top</a></p>
</section>
<!-- end service-->
<section id=contact>
<h1> Contact </h1>
<span class=info>Email: <a href="mailto:lsidir@gmail.com">lsidir@gmail.com</a></span>
<div><span class=info>Phone:</span> </div>
<div><span class=info>Office:</span> </div>
<div><span class=info>Address:<br></span>
<br>
<br>
<br>
</div>
<p align="right"><a href="#">back to top</a></p>
</section>
</main>
<footer>
disclaimer:
</footer>
</body>
</html>