This repository has been archived by the owner on Feb 16, 2023. It is now read-only.

dirent: Very truncated list #6

Open
donatj opened this issue Jul 13, 2020 · 0 comments

I have a directory with 28,282,746 files in it on a CentOS system.

The file system is XFS, which should easily be able to handle it.

I have essentially the following code, which tries to scan the file names from the directory:

Code Example
package main

import (
	"bytes"
	"flag"
	"fmt"
	"io"

	"github.com/EricLagergren/go-gnulib/dirent"
)

func int8ToString(s []int8) string {
	var buff bytes.Buffer
	for _, chr := range s {
		if chr == 0x00 {
			break
		}
		buff.WriteByte(byte(chr))
	}
	return buff.String()
}

func main() {
	flag.Parse()

	stream, err := dirent.Open(flag.Arg(0))
	if err != nil {
		panic(err)
	}
	defer stream.Close()
	for {
		entry, err := stream.Read()
		if err != nil {
			if err == io.EOF {
				break
			}
			panic(err)
		}

		name := int8ToString(entry.Name[:])
		fmt.Println(name)
	}
}

However, I am only getting 63 files listed.

I have been poking around in the func (s *Stream) Read() (*unix.Dirent, error) method. The EOF I am receiving comes from:

	dirent := *(*unix.Dirent)(unsafe.Pointer(&s.buf[s.bp]))
	if dirent.Reclen == 0 {
		return nil, io.EOF
	}

I added a log.Printf("%#v", dirent) just before the return and received:

2020/07/13 18:00:13 unix.Dirent{Ino:0x0, Off:0, Reclen:0x0, Type:0x0, Name:[256]int8{0, 0, 0, 0, 0, 0
, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, _:[5]uint8{0x0, 0x0, 0x0, 0x0, 0x0}}                   

To my eye, this looks pretty uninitialized.

I tried moving

	if isAbsent(&dirent) {
		return s.Read()
	}

to immediately follow dirent := *(*unix.Dirent)(unsafe.Pointer(&s.buf[s.bp])), as I thought its current placement might be an oversight, but this just caused infinite recursion.

I added log.Println("getting dents", len(s.buf)) before the call to unix.Getdents to ensure that the buffer size was never zero. It is 4096; however, I noticed that the call only ever triggers once.

I added:

log.Println("exiting", s.bp)

immediately before the return nil, io.EOF statement and received 2020/07/13 18:24:01 exiting 4080. Since 4080 is not >= 4096, unix.Getdents never runs more than once. I suspect this implies some sort of underflow or overflow issue in the code, but I'm not sure, and I think this is beyond me.
