<!DOCTYPE html>
<html lang="en-US">
    <head>
        <title>Lefteris Sidirourgos -- Λευτέρης Σιδηρουργός</title>
        <meta charset="UTF-8"/>
        <meta name="keywords" content="Lefteris, Eleftherios, Sidirourgos, Λευτέρης, Ελευθέριος, Σιδηρουργός">
        <meta name="author" content="Lefteris Sidirourgos">
        <meta name="viewport" content="width=device-width, initial-scale=1">
        <link rel="stylesheet" href="w3.css">
        <link rel="stylesheet" href="page.css">
    </head>
    <body>
        <!-- begin header -->
        <header>
			<div class="w3-content w3-display-container" style="max-width:1000px; height:160px;" >

				<div class="w3-display-left" style="display:inline-block;">
					<figure style="display:inline-block; vertical-align:middle;">
						<img class="w3-round" src="imgs/lefterios.jpg" alt="Lefteris Sidirourgos - Λευτέρης Σιδηρουργός" height="160px">
					</figure>
					<div style="display:inline-block; vertical-align:middle; text-align:left; padding:10px;">
						<div id=name>Lefteris Sidirourgos</div>
						<div id=name>Λευτέρης Σιδηρουργός</div>
						<div id=affiliation></div>
						<div id=affiliation>MADgIK, Dept of Informatics</div>
						<div id=affiliation>University of Athens</div>
					</div>
				</div>
				<div class="w3-display-right">
					<figure style="display:inline-block; vertical-align:middle;">
						<a href="http://www.madgik.di.uoa.gr/" target="_blank">
						<img src="imgs/madgik.png" alt="madgik">
						</a>
					</figure>
				</div>
			</div>
        </header>
        <!-- end header -->
        <nav>
			<div class="w3-content" style="max-width:1000px">
				<div style="display:flex; justify-content:space-around;">
					<a href="#">research</a>
					<a href="#publications">publications</a>
					<a href="#bio">bio</a>
					<a href="#talks">talks</a>
					<a href="#service">service</a>
					<a href="#contact">contact</a>
				</div>
			</div>
        </nav>
        <main class="w3-content" style="max-width:1000px" >
			<!--begin of research-->
            <section id=research>
                <p style="text-align:left">
					My research interest is database architecture. I obtained a PhD from the University of
					Amsterdam while I was a member of the Database Architectures Group of CWI in Amsterdam. I have
					been a core developer of the MonetDB open-source database system since 2007.
                </p>
				<p>
				Previous affiliations include:
				<ul>
					<li><span class="date">2017-2018</span> Post-Doctoral Researcher at the 
						<a href="https://www.systems.ethz.ch/" target="_blank">Systems Group</a>
					of ETH Zürich.
					<a href="https://www.ethz.ch/en.html" style="display:inline-block; vertical-align:top;" target="_blank">
						<img src="imgs/ethz_logo_black.svg" alt="eth" height="20px">
					</a>
					</li>
				<li><span class="date">2007-2017</span> Researcher at the
					<a href="https://www.cwi.nl/research/groups/database-architectures" target="_blank">
						Database Architectures
					</a>
				group of CWI, Amsterdam.
					<a href="https://www.cwi.nl" style="display:inline-block; vertical-align:middle;" target="_blank">
						<img src="imgs/cwi_logo.png" alt="cwi" height="40px">
					</a>
				</li>
				<li>
				In the summer of <span class="date">2012</span>, I was a research intern at 
				<a href="https://www.microsoft.com/en-us/research/lab/microsoft-research-redmond/" target="_blank">
				Microsoft Research Lab</a>, in Redmond, WA.
				</li>
				<li><span class="date">2001-2005</span> Member of the 
				<a href="http://www.ics.forth.gr/isl/index_main.php?l=e&c=253" target="_blank">
				Information Systems Laboratory
				</a> at ICS-FORTH, Heraklion, Crete.
					<a href="http://www.ics.forth.gr/index.html" style="display:inline-block; vertical-align:middle;" target="_blank">
						<img src="imgs/icsforth_logo.jpg" alt="ics-forth" height="50px">
					</a>
				</li>
				</ul>
				</p>
				<p>
				My GitHub <a href="https://github.com/lsidir" target="_blank">profile <img src="imgs/github_logo.png" alt="github" height="30px"></a> hosts various repositories, such as the stand-alone SIMD implementation of <a href="https://github.com/lsidir/imprints" target="_blank">Column Imprints</a>.
				</p>
				<p>
					Most of my free time at work is consumed by
					<a href="https://www.monetdb.org/Home" style=" vertical-align:middle;" target="_blank">
						<img src="imgs/monetdblogo.png" alt="monetdb" height="50px">
					MonetDB: The column-store pioneer</a>. My contributions to MonetDB include supporting/maintaining the GDK kernel, the join, select, and sampling
					operators, bug fixing, the original implementation of column imprints and ordered index, as well as
					code for supporting non-relational data models such as RDF and XML.
				</p>
				<p>My PhD research was funded by the Netherlands Organisation for Scientific Research &nbsp;<a href="https://www.nwo.nl/en/" target="_blank"><img src="imgs/nwo_logo.png" alt="nwo" height="30px"></a>&nbsp; under the project <a href="http://www.narcis.nl/research/RecordID/OND1336213/Language/en" target="_blank">Querying while Transforming Large Graph Databases</a>.
				</p>
				<p align="right"><a href="#">back to top</a></p>
            </section>
			<!-- end of research-->
            <section id=publications class=publications>
			<h1>Publications</h1>
			<a href="http://dl.acm.org/author_page.cfm?id=81388596471" target="_blank" rel="external">ACM Digital Library</a>
			&nbsp;-&nbsp;
			<a href="http://dblp2.uni-trier.de/pers/hd/s/Sidirourgos:Lefteris" target="_blank" rel="external">DBLP</a>
			&nbsp;-&nbsp;
			<a href="http://scholar.google.ch/citations?user=SAJM3PkAAAAJ&hl=en" target="_blank" rel="external">Google Scholar</a>
			<!--
			<h2>2017</h2>
			<table id=pub>
			<tr>
			<td>[c9]</td>
			<td>
			Sidirourgos, Lefteris, and Hannes M\"uhleisen. <i>"Scaling Column Imprints using Advanced Vectorization."</i> 13th International Workshop on Data Management on New Hardware (DaMoN 2017). In proceedings of the 2017 ACM SIGMOD/PODS International Conference on Management of Data (SIGMOD'17). ACM, 2017
			</td>
			</tr>
			</table>
			<h2>2013</h2>
			<h2>2012</h2>
			<h2>2008</h2>
			N2. Sidirourgos Lefteris, Column-Store Support for RDF Data Management: not all swans are white. In Dutch-Belgian Database Day. Namur, Belgium, Oct 10, 2008</p>
			N3.Θεσιακός Χειρισμός των Ενημερώσεων Τιμών σε Βάσεις Στηλικής Αποθήκευσης
Sandor Heman (Vectorwise), Marcin Zukowski (Vectorwise), Niels Nes (Centrum Wiskunde &amp;
Informatica), Λευτέρης Σιδηρουργός (Centrum Wiskunde &amp; Informatica), Peter Boncz
(Centrum Wiskunde &amp; Informatica)
-->
				<p align="right"><a href="#">back to top</a></p>
            </section>
			<section id=bio>
			<h1> Bio </h1>
				<!--p><a href="cv/sidirourgos_cv_gr.pdf" target="_blank" download>Βιογραφικό</a> </p-->
				<p>
					Lefteris Sidirourgos is a postdoctoral researcher at the Systems Group of ETH Zürich. Before that, he was a postdoctoral
					researcher at the Database Architectures Group of CWI in Amsterdam. He received his Ph.D. in Computer Science from the
					University of Amsterdam. He also holds an M.Sc. in Computer Science and a B.Sc. in Mathematics from the University of Crete, Greece.
					He has been a core developer for the MonetDB open source database system for the past 10 years. He has also worked with
					other commercial database engines such as Microsoft Hekaton, Actian Vectorwise, and LogicBlox. His main research interests
					include indexing, storage, compression, sampling, approximate query processing, and graph processing. He always
					looks for opportunities to expand his research into new areas of computer science.
				</p>
				<p><a href="cv/sidirourgos_cv.pdf" target="_blank" download>Curriculum Vitae</a></p>
				<p align="right"><a href="#">back to top</a></p>
			</section>
			<!-- begin talks-->
			<section id=talks>
			<h1>Invited Talks</h1>
			<div class="title">A Nexus of Indexes</div>
			<span class="date">11.2016</span>
			<span class="location">Systems Group, ETH Zürich</span>
			<details>
			<summary>Abstract</summary>
			<p>
				The large memories available for read-optimized databases have changed the landscape of index creation and exploitation.
				Instead of complex data structures aimed at fast concurrent OLTP updates, modern column stores rely on fast scans over
				(sorted) partitions and/or the use of partial indices. In this talk we will present two such indexes, namely imprints
				and secondary projections. These indexes are designed to alleviate the storage duplication caused by complex primary
				indexes and support a compact representation of intermediates in the order required for the remainder of the query
				execution plan. We will also present our vision for a nexus of indexes, where an index is not a stand-alone structure
				inside a database engine, but part of a series of connections linking two or more indexes. Finally, we will discuss
				future work on hardware-driven index design that extends across the three layers of shared-memory programming.
			</p>
			</details>
			</p>
			<div class="title">A Database System with Amnesia</div>
			<span class="date">09.2016</span>
			<span class="location">CWI Scientific Meeting, Amsterdam</span>
			<details>
			<summary>Abstract</summary>
			<p>
				Big Data comes with huge challenges. Its volume and velocity make handling, curating, and analytical processing a
				costly affair. Even to simply “look at” the data within an a priori defined budget and with a guaranteed interactive
				response time frame might be impossible. Scale-out approaches will hit the technology and monetary wall soon,
				if they have not done so already. Blindly dumping data when the channel is full, or reducing the data resolution at the source,
				might lead to loss of valuable observations. An army of well-educated database administrators or full software
				stack architects might deal with the challenges, but a seemingly knobless DBMS is to be preferred. A fundamental change
				in database management is called for. One approach, called data rotting, has been proposed as an alternative solution.
				For the sake of storage management and responsiveness, it lets the DBMS semi-autonomously rot away data. Rotting is based
				on the system's own unwillingness to keep old data as easily accessible as fresh data. Our work sheds more light on the
				opportunities and potential impacts of this radical departure in data management. Specifically, we study the case where
				a DBMS selectively forgets tuples (by marking them inactive) under various amnesia scenarios and with different implementation
				strategies. Our final aim is to use the findings of this study to morph an existing data management engine to serve demanding
				big data scientific applications with well-chosen data amnesia algorithms.
			</p>
			</details>
			</p>
			<div class="title">Bits are Valuable: making a Frugal Database Kernel</div>
			<span class="date">06.2013</span>
			<span class="location">LogicBlox, Atlanta, GA</span>
			<details>
			<summary>Abstract</summary>
			<p>
				In this talk, we are going to iterate over a collection of indexes
				in the making. Each index serves a different query scenario, but
				they all have one common design principle: they use just a few
				bits to decide if a large piece of data needs to be fetched from
				storage. Transferring data is costly, so each bit that encodes
				information to avoid unnecessary data transfer is valuable.
				A frugal database kernel leaves no bits behind.
				To give one example, we introduce the column imprint, a simple but efficient
				cache-conscious secondary index. A column imprint is a collection of
				many small bit vectors, each indexing the data points of a single
				cacheline. Next, we will talk about Split Bloom filters. In this
				scenario, instead of using a single large Bloom filter, we split it
				into multiple smaller Bloom filters, each one covering a separate subset
				of tuples. This makes it possible to adjust the size of the filters
				and use more bits for subsets that are larger and/or more frequently
				accessed. Finally, we will present future ideas on how multiple Bloom
				filters can be used to implement unary and binary database operators that
				work over extra large data sets.
			</p>
			</details>
			</p>
			<div class="title">Column Imprints: A Secondary Index Structure</div>
			<span class="date">04.2013</span>
			<span class="location">DIAS Lab, EPFL, Lausanne</span>
			<details>
			<summary>Abstract</summary>
			<p>
			Large-scale data warehouses rely heavily on secondary indexes,
			such as bitmaps and B-trees, to limit access to slow IO devices.
			However, with the advent of large main memory systems, cache-conscious
			secondary indexes are needed to also improve the transfer
			bandwidth between memory and CPU. In this presentation, we introduce
			the column imprint, a simple but efficient cache-conscious secondary index.
			A column imprint is a collection of many small bit
			vectors, each indexing the data points of a single cacheline. An
			imprint is used during query evaluation to limit data access and
			thus minimize memory traffic. The compression for imprints is
			CPU-friendly and exploits the empirical observation that data often
			exhibits local clustering or partial ordering as a side-effect of the
			construction process. Most importantly, column imprint compression remains
			effective and robust even in the case of unclustered data, while other
			state-of-the-art solutions fail. We conducted an
			extensive experimental evaluation to assess the applicability and
			the performance impact of the column imprints. The storage overhead,
			when experimenting with real-world datasets, is just a few
			percent over the size of the columns being indexed. The evaluation
			time for over 40000 range queries of varying selectivity revealed
			the efficiency of the proposed index compared to zonemaps and
			bitmaps with WAH compression.
			</p>
			</details>
			</p>
			<div class="title">SciBORQ: Scientific data management with Bounds On Runtime and Quality</div>
			<span class="date">01.2011</span>
			<span class="location">HP Labs, Palo Alto, CA</span>
			<div></div>
			<span class="date">01.2011</span>
			<span class="location">IBM Research Almaden, San Jose, CA</span>
			<details>
			<summary>Abstract</summary>
			<p>
				Data warehouses underlying virtual observatories stress the capabilities of
				database management systems in many ways. They are filled on a daily basis with
				large amounts of factual information, derived from intensive data scrubbing
				and computational feature extraction pipelines. Querying these huge databases
				requires a sizable computing cluster, while ideally the initial investigation
				should run interactively on as few resources as possible.
			</p>
			<p>
				In this talk, we will explore a different route, based on the observation that
				at any given time only a fraction of the data is of primary value for a
				specific task. This fraction becomes the focus of scientific reflection through
				an iterative process of ad-hoc query refinement. We will present SciBORQ, a
				framework for scientific data exploration that allows precise control over the
				runtime and the quality of query answering. Novel techniques are presented to
				derive multiple interesting data samples, called impressions. An impression is
				selected such that the statistical error of a query answer remains low, while
				the result can be computed within strict time bounds. Impressions differ from
				previous sampling approaches in their bias towards the focal point of the
				scientific data exploration, their multi-layer design, and their adaptiveness
				to shifting query workload.
			</p>
			</details>
				<p align="right"><a href="#">back to top</a></p>
			</section>
			<!-- end talks-->
			<!-- begin service -->
			<section id=service>
				<h1>Professional Service</h1>
				<h2>Program Committee Member</h2>
				<ul>
					<li><span class="date">2017</span> Reproducibility Committee – ACM SIGMOD 2017/2018 Reproducibility
					<li><span class="date">2017</span> Research Track – 43rd International Conference on Very Large Data Bases (VLDB) 2017
					<li><span class="date">2016</span> Demo Track – 42nd International Conference on Very Large Data Bases (VLDB) 2016
					<li><span class="date">2014</span> 2nd International workshop on Benchmarking RDF Systems (BeRSys 2014), co-located with VLDB
				</ul>
				<h2>Journal Reviewer</h2>
				<ul>
					<li><span class="date">2018</span> The VLDB Journal
					<li><span class="date">2017</span> Elsevier Journal of Information Systems (IS)
					<li><span class="date">2010, 2015, 2017</span> IEEE Transactions on Knowledge and Data Engineering (IEEE TKDE)
					<li><span class="date">2010, 2011, 2012, 2014</span> Elsevier Journal of Web Semantics (JWS)
					<li><span class="date">2008</span> Springer Journal of Computer Science and Technology (JCST)
				</ul>
				<p align="right"><a href="#">back to top</a></p>
			</section>
			<!-- end service-->
            <section id=contact>
				<h1> Contact </h1>
                <span class=info>Email: <a href="mailto:lsidir@gmail.com">lsidir@gmail.com</a></span>
                <div><span class=info>Phone:</span> </div>
                <div><span class=info>Office:</span> </div>
                <div><span class=info>Address:<br></span>
				<br>
				<br>
				<br>
				</div>
				<p align="right"><a href="#">back to top</a></p>
            </section>

        </main>
        <footer>
            disclaimer:
        </footer>
    </body>
</html>