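Studying the FILE vtable functions

Introduction

glibc defines a variety of vtables, for example in fileops, strops, and wfileops.

The full list can be found in glibc/libio/vtables.c. Each vtable has its own implementations of the functions, but all of them serve I/O and all revolve around _IO_FILE.

This article focuses on the file vtable functions, most of which live in fileops.c.

The comment at the top of fileops.c carries a lot of information and is worth reading carefully.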

/* An fstream can be in at most one of put mode, get mode, or putback mode.
Putback mode is a variant of get mode.

In a filebuf, there is only one current position, instead of two
separate get and put pointers. In get mode, the current position
is that of gptr(); in put mode that of pptr().

The position in the buffer that corresponds to the position
in external file system is normally _IO_read_end, except in putback
mode, when it is _IO_save_end and also when the file is in append mode,
since switching from read to write mode automatically sends the position in
the external file system to the end of file.
If the field _fb._offset is >= 0, it gives the offset in
the file as a whole corresponding to eGptr(). (?)

PUT MODE:
If a filebuf is in put mode, then all of _IO_read_ptr, _IO_read_end,
and _IO_read_base are equal to each other. These are usually equal
to _IO_buf_base, though not necessarily if we have switched from
get mode to put mode. (The reason is to maintain the invariant
that _IO_read_end corresponds to the external file position.)
_IO_write_base is non-NULL and usually equal to _IO_buf_base.
We also have _IO_write_end == _IO_buf_end, but only in fully buffered mode.
The un-flushed character are those between _IO_write_base and _IO_write_ptr.

GET MODE:
If a filebuf is in get or putback mode, eback() != egptr().
In get mode, the unread characters are between gptr() and egptr().
The OS file position corresponds to that of egptr().

PUTBACK MODE:
Putback mode is used to remember "excess" characters that have
been sputbackc'd in a separate putback buffer.
In putback mode, the get buffer points to the special putback buffer.
The unread characters are the characters between gptr() and egptr()
in the putback buffer, as well as the area between save_gptr()
and save_egptr(), which point into the original reserve buffer.
(The pointers save_gptr() and save_egptr() are the values
of gptr() and egptr() at the time putback mode was entered.)
The OS position corresponds to that of save_egptr().

LINE BUFFERED OUTPUT:
During line buffered output, _IO_write_base==base() && epptr()==base().
However, ptr() may be anywhere between base() and ebuf().
This forces a call to filebuf::overflow(int C) on every put.
If there is more space in the buffer, and C is not a '\n',
then C is inserted, and pptr() incremented.

UNBUFFERED STREAMS:
If a filebuf is unbuffered(), the _shortbuf[1] is used as the buffer.
*/

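In summary:

  1. Stream modes: a file stream is in at most one of put mode, get mode, or putback mode. Putback mode is a variant of get mode.
  2. Current position: a filebuf has a single current position rather than separate get and put pointers. In get mode the current position is gptr(); in put mode it is pptr().
  3. Buffer position versus external file position: the buffer position that corresponds to the external (OS) file position is normally _IO_read_end. The exceptions are putback mode, where it is _IO_save_end, and append mode, where switching from reading to writing automatically moves the external position to the end of the file. If the field _fb._offset is >= 0, it gives the offset in the whole file that corresponds to eGptr().
  4. PUT mode: in put mode, _IO_read_ptr, _IO_read_end, and _IO_read_base are all equal. They usually equal _IO_buf_base, but not necessarily if the stream has switched from get mode to put mode (this preserves the invariant that _IO_read_end corresponds to the external file position). _IO_write_base is non-NULL and usually equals _IO_buf_base. _IO_write_end equals _IO_buf_end, but only in fully buffered mode. The unflushed characters are those between _IO_write_base and _IO_write_ptr.
  5. GET mode: in get or putback mode, eback() != egptr(). In get mode the unread characters lie between gptr() and egptr(). The OS file position corresponds to egptr().
  6. PUTBACK mode: putback mode remembers "excess" characters that were pushed back with sputbackc by storing them in a separate putback buffer. In putback mode the get buffer points to that special putback buffer. The unread characters are those between gptr() and egptr() in the putback buffer, plus the area between save_gptr() and save_egptr(), which point into the original reserve buffer. The OS position corresponds to save_egptr().
  7. Line-buffered output: during line-buffered output, _IO_write_base equals base() and epptr() also equals base(), although ptr() may be anywhere between base() and ebuf(). This forces a call to filebuf::overflow(int C) on every put. If there is room left in the buffer (pptr < ebuf) and C is not '\n', C is stored and pptr() is incremented; otherwise the buffer is flushed.
  8. Unbuffered streams: if a filebuf is unbuffered(), _shortbuf[1] serves as the buffer.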
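
The put-mode rule that unflushed characters sit between _IO_write_base and _IO_write_ptr can be observed from user code. The following is a minimal sketch, assuming a POSIX system; the function names `file_size_of` and `buffered_write_demo` are invented for this illustration and are not part of glibc. It writes to a fully buffered stream and compares the file size reported by the kernel before and after fflush.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: size of the file behind a stream, as the kernel sees it. */
static long file_size_of(FILE *fp)
{
    struct stat st;
    if (fstat(fileno(fp), &st) != 0)
        return -1;
    return (long) st.st_size;
}

/* Writes 5 bytes to a fully buffered temporary file and records the
   kernel-visible size before and after fflush.
   Returns 0 on success, -1 on failure. */
int buffered_write_demo(long *before_flush, long *after_flush)
{
    static char buf[BUFSIZ];   /* static, so it outlives the stream */
    assert(before_flush != NULL && after_flush != NULL);

    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    /* Request full buffering explicitly; this must precede any other I/O. */
    if (setvbuf(fp, buf, _IOFBF, sizeof buf) != 0 || fputs("hello", fp) == EOF) {
        fclose(fp);
        return -1;
    }
    *before_flush = file_size_of(fp);  /* data still sits in the stdio buffer */
    if (fflush(fp) != 0) {
        fclose(fp);
        return -1;
    }
    *after_flush = file_size_of(fp);   /* data has reached the file */
    fclose(fp);
    return 0;
}
```

Calling `buffered_write_demo(&before, &after)` should leave `before` at 0 and `after` at 5, because the five characters stay in the stream buffer until the flush.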

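Related system calls

lseek

lseek is a system call for file operations. Its main job is to change the file offset, which is what makes random access to a file possible. In more detail:

  1. Positioning the file offset: lseek moves the read/write position to any place in the file. The target is given as an offset, usually relative to the start of the file.
  2. Reading and writing at a specific position: once the offset has been moved, reads and writes happen at that position instead of proceeding byte by byte in order. This is very useful for large files and databases.
  3. Random access: lseek is the key to random access, letting a program reach and process any part of a file quickly without reading the data in file order.
  4. Extending a file: seeking past the end of the file and then writing extends the file (the gap reads back as zero bytes). lseek cannot shrink a file; truncation is done with truncate or ftruncate.

It returns the resulting file offset on success and -1 on error.

The prototype of lseek is: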

off_t lseek(int fd, off_t offset, int whence);
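  • fd is the file descriptor of the file to reposition.
  • offset is the distance to move. It may be positive, negative, or zero, and is interpreted according to whence.
  • whence selects the base position for the offset and usually takes one of these values:
    • SEEK_SET: the new position is offset bytes from the start of the file.
    • SEEK_CUR: the new position is the current position plus offset.
    • SEEK_END: the new position is the end of the file plus offset.

lseek moves the file offset of descriptor fd according to offset and whence and returns the new offset, measured from the start of the file. On error it returns (off_t) -1 and sets errno.

libio defines matching constants for the three whence values: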

#define _IO_seek_set 0
#define _IO_seek_cur 1
#define _IO_seek_end 2
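
As a small illustration of the three whence values, here is a sketch, assuming a POSIX system, that performs one seek of each kind on a 10-byte temporary file. The function name `lseek_demo` is made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Runs three seeks on a 10-byte temporary file and stores the offsets
   returned by lseek in out[0..2]. Returns 0 on success, -1 on failure. */
int lseek_demo(off_t out[3])
{
    assert(out != NULL);

    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    int fd = fileno(fp);
    if (write(fd, "0123456789", 10) != 10) {
        fclose(fp);
        return -1;
    }
    out[0] = lseek(fd, 2, SEEK_SET);   /* absolute position 2 */
    out[1] = lseek(fd, 3, SEEK_CUR);   /* 3 past the current position: 5 */
    out[2] = lseek(fd, -1, SEEK_END);  /* 1 before the end of file: 9 */
    fclose(fp);
    return 0;
}
```

The returned offsets are always absolute, regardless of which whence value was used for the call.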

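sync

sync is a system call that forces data still held in kernel buffers out to the storage device, for the sake of durability and file system consistency. Note that the sync slot in the file vtable (_IO_new_file_sync, covered below) does not call sync(2); it only flushes the stream's user-space buffer and adjusts the file offset. The main effects of sync are:

  1. Durability: sync makes the operating system write all data that has not yet reached the disk to the physical storage device, so that this data is not lost if the system later crashes or loses power.
  2. File system consistency: while writing file and directory information, a file system maintains internal data structures that must reach the disk in time for the file system to stay consistent. sync makes sure these structures and their associated data are written out.
  3. Buffer flushing: data is normally kept in memory first for performance and flushed to disk periodically. sync flushes the kernel buffers so that this cached data is written to disk.
  4. Data integrity: sync helps preserve data integrity by making sure that everything written so far is stored persistently on disk.

stat

The stat system call retrieves information about a file or directory, such as its size, access permissions, owning user and group, and file type. It fills in a structure that holds the file information, usually called struct stat.

stat provides the following:

  1. Basic attributes: file size, creation time, modification time, access time, and so on.
  2. Permission information: the owner, the group, and the permissions for other users.
  3. File type: whether the file is a regular file, a directory, a symbolic link, or some other type.
  4. File system details: information such as the file system block size and the device number.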

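/* This is the MinGW (Windows) definition, shown for the field meanings;
   glibc's struct stat64 is similar but has additional fields such as st_blksize. */
struct _stat64 {
_dev_t st_dev; /* ID of the device containing the file */
_ino_t st_ino; /* inode number */
unsigned short st_mode; /* file type and permission bits */
short st_nlink; /* number of hard links */
short st_uid; /* user ID of the owner (UID) */
short st_gid; /* group ID of the owner (GID) */
_dev_t st_rdev; /* device ID, for special files */
__MINGW_EXTENSION __int64 st_size; /* file size in bytes */
__time64_t st_atime; /* time of last access */
__time64_t st_mtime; /* time of last modification */
__time64_t st_ctime; /* time of last status change */
};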
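
To show how these fields are read in practice, here is a minimal sketch, assuming a POSIX system, that calls fstat on a temporary file and extracts the size and the file type. The function name `fstat_demo` is invented for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Writes 4 bytes to a temporary file, then asks the kernel for its size
   and type. Returns 0 on success, -1 on failure. */
int fstat_demo(long *size, int *is_regular)
{
    assert(size != NULL && is_regular != NULL);

    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    int fd = fileno(fp);
    struct stat st;
    if (write(fd, "data", 4) != 4 || fstat(fd, &st) != 0) {
        fclose(fp);
        return -1;
    }
    *size = (long) st.st_size;                  /* size in bytes */
    *is_regular = S_ISREG(st.st_mode) ? 1 : 0;  /* file type test on st_mode */
    fclose(fp);
    return 0;
}
```

The same pattern (fstat, then S_ISCHR or S_ISREG on st_mode) appears later inside _IO_file_doallocate and _IO_new_file_seekoff.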

Others

read, write, open, and close are also used, but they are familiar enough that they are not covered here.

IO vtable functions

versioned_symbol (libc, _IO_new_do_write, _IO_do_write, GLIBC_2_1);
versioned_symbol (libc, _IO_new_file_attach, _IO_file_attach, GLIBC_2_1);
versioned_symbol (libc, _IO_new_file_close_it, _IO_file_close_it, GLIBC_2_1);
versioned_symbol (libc, _IO_new_file_finish, _IO_file_finish, GLIBC_2_1);
versioned_symbol (libc, _IO_new_file_fopen, _IO_file_fopen, GLIBC_2_1);
versioned_symbol (libc, _IO_new_file_init, _IO_file_init, GLIBC_2_1);
versioned_symbol (libc, _IO_new_file_setbuf, _IO_file_setbuf, GLIBC_2_1);
versioned_symbol (libc, _IO_new_file_sync, _IO_file_sync, GLIBC_2_1);
versioned_symbol (libc, _IO_new_file_overflow, _IO_file_overflow, GLIBC_2_1);
versioned_symbol (libc, _IO_new_file_seekoff, _IO_file_seekoff, GLIBC_2_1);
versioned_symbol (libc, _IO_new_file_underflow, _IO_file_underflow, GLIBC_2_1);
versioned_symbol (libc, _IO_new_file_write, _IO_file_write, GLIBC_2_1);
versioned_symbol (libc, _IO_new_file_xsputn, _IO_file_xsputn, GLIBC_2_1);

const struct _IO_jump_t _IO_file_jumps libio_vtable =
{
JUMP_INIT_DUMMY,
JUMP_INIT(finish, _IO_file_finish),
JUMP_INIT(overflow, _IO_file_overflow),
JUMP_INIT(underflow, _IO_file_underflow),
JUMP_INIT(uflow, _IO_default_uflow),
JUMP_INIT(pbackfail, _IO_default_pbackfail),
JUMP_INIT(xsputn, _IO_file_xsputn),
JUMP_INIT(xsgetn, _IO_file_xsgetn),
JUMP_INIT(seekoff, _IO_new_file_seekoff),
JUMP_INIT(seekpos, _IO_default_seekpos),
JUMP_INIT(setbuf, _IO_new_file_setbuf),
JUMP_INIT(sync, _IO_new_file_sync),
JUMP_INIT(doallocate, _IO_file_doallocate),
JUMP_INIT(read, _IO_file_read),
JUMP_INIT(write, _IO_new_file_write),
JUMP_INIT(seek, _IO_file_seek),
JUMP_INIT(close, _IO_file_close),
JUMP_INIT(stat, _IO_file_stat),
JUMP_INIT(showmanyc, _IO_default_showmanyc),
JUMP_INIT(imbue, _IO_default_imbue)
};
libc_hidden_data_def (_IO_file_jumps)

finish

finish flushes and closes the stream, releases its buffer, and unlinks the stream from the _IO_list_all chain.

1 _IO_new_file_finish

void
_IO_new_file_finish (FILE *fp, int dummy)
{
if (_IO_file_is_open (fp))
{
_IO_do_flush (fp);
if (!(fp->_flags & _IO_DELETE_DONT_CLOSE))
_IO_SYSCLOSE (fp);
}
_IO_default_finish (fp, 0);
}
libc_hidden_ver (_IO_new_file_finish, _IO_file_finish)
  1. Check whether the file is open. If it is, call _IO_do_flush (fp), then close the file through _IO_SYSCLOSE unless the _IO_DELETE_DONT_CLOSE flag is set.
  2. Call _IO_default_finish (fp, 0).

2-1 _IO_do_flush

#define _IO_do_flush(_f) \
((_f)->_mode <= 0 \
? _IO_do_write(_f, (_f)->_IO_write_base, \
(_f)->_IO_write_ptr-(_f)->_IO_write_base) \
: _IO_wdo_write(_f, (_f)->_wide_data->_IO_write_base, \
((_f)->_wide_data->_IO_write_ptr \
- (_f)->_wide_data->_IO_write_base)))

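There are two branches, depending on whether the stream is wide-character (_mode > 0) or not.

For now we only follow the non-wide branch. It calls _IO_do_write, which ends up in another vtable function, write; that is covered later.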

2-2 _IO_default_finish

void
_IO_default_finish (FILE *fp, int dummy)
{
struct _IO_marker *mark;
if (fp->_IO_buf_base && !(fp->_flags & _IO_USER_BUF))
{
free (fp->_IO_buf_base);
fp->_IO_buf_base = fp->_IO_buf_end = NULL;
}

for (mark = fp->_markers; mark != NULL; mark = mark->_next)
mark->_sbuf = NULL;

if (fp->_IO_save_base)
{
free (fp->_IO_save_base);
fp->_IO_save_base = NULL;
}

_IO_un_link ((struct _IO_FILE_plus *) fp);

#ifdef _IO_MTSAFE_IO
if (fp->_lock != NULL)
_IO_lock_fini (*fp->_lock);
#endif
}
libc_hidden_def (_IO_default_finish)
  1. If the buffer is non-NULL and the _IO_USER_BUF flag is clear, free the buffer and set the buf pointers to NULL.
  2. Clear the _sbuf field of every marker of the stream.
  3. If the stream's _IO_save_base is non-NULL, free it and set it to NULL.
  4. Call _IO_un_link to remove the stream from the _IO_list_all chain.

void
_IO_un_link (struct _IO_FILE_plus *fp)
{
if (fp->file._flags & _IO_LINKED)
{
FILE **f;
#ifdef _IO_MTSAFE_IO
_IO_cleanup_region_start_noarg (flush_cleanup);
_IO_lock_lock (list_all_lock);
run_fp = (FILE *) fp;
_IO_flockfile ((FILE *) fp);
#endif
if (_IO_list_all == NULL)
;
else if (fp == _IO_list_all)
_IO_list_all = (struct _IO_FILE_plus *) _IO_list_all->file._chain;
else
for (f = &_IO_list_all->file._chain; *f; f = &(*f)->_chain)
if (*f == (FILE *) fp)
{
*f = fp->file._chain;
break;
}
fp->file._flags &= ~_IO_LINKED;
#ifdef _IO_MTSAFE_IO
_IO_funlockfile ((FILE *) fp);
run_fp = NULL;
_IO_lock_unlock (list_all_lock);
_IO_cleanup_region_end (0);
#endif
}
}
  1. First confirm that the stream is on the _IO_list_all chain (_IO_LINKED is set).
  2. Find fp in the _IO_list_all chain, unlink it, and clear the _IO_LINKED flag.
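
The unlinking step is an ordinary removal from a singly linked list through a pointer-to-pointer. The sketch below models it with a toy `struct node` (a made-up type, not glibc's FILE). Unlike _IO_un_link, which special-cases the list head, this version starts the pointer-to-pointer walk at the head itself, which covers both cases with one loop.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a FILE on the _IO_list_all chain. */
struct node {
    int id;
    struct node *chain;   /* plays the role of _chain */
};

/* Removes target from the list starting at *head.
   Returns 1 if the node was found and unlinked, 0 otherwise. */
int unlink_node(struct node **head, struct node *target)
{
    assert(head != NULL && target != NULL);

    /* f always points at the link that leads to the node being examined. */
    for (struct node **f = head; *f != NULL; f = &(*f)->chain) {
        if (*f == target) {
            *f = target->chain;   /* bypass the target */
            target->chain = NULL;
            return 1;
        }
    }
    return 0;
}
```

Because `f` holds the address of the link rather than the node, removing the head and removing an interior node are the same assignment.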

overflow

overflow is mainly responsible for writing buffered data out to the underlying file (or device).

int
_IO_new_file_overflow (FILE *f, int ch)
{
if (f->_flags & _IO_NO_WRITES) /* SET ERROR */
{
f->_flags |= _IO_ERR_SEEN;
__set_errno (EBADF);
return EOF;
}
/* If currently reading or no buffer allocated. */
if ((f->_flags & _IO_CURRENTLY_PUTTING) == 0 || f->_IO_write_base == NULL)
{
/* Allocate a buffer if needed. */
if (f->_IO_write_base == NULL)
{
_IO_doallocbuf (f);
_IO_setg (f, f->_IO_buf_base, f->_IO_buf_base, f->_IO_buf_base);
}
/* Otherwise must be currently reading.
If _IO_read_ptr (and hence also _IO_read_end) is at the buffer end,
logically slide the buffer forwards one block (by setting the
read pointers to all point at the beginning of the block). This
makes room for subsequent output.
Otherwise, set the read pointers to _IO_read_end (leaving that
alone, so it can continue to correspond to the external position). */
if (__glibc_unlikely (_IO_in_backup (f)))
{
size_t nbackup = f->_IO_read_end - f->_IO_read_ptr;
_IO_free_backup_area (f);
f->_IO_read_base -= MIN (nbackup,
f->_IO_read_base - f->_IO_buf_base);
f->_IO_read_ptr = f->_IO_read_base;
}

if (f->_IO_read_ptr == f->_IO_buf_end)
f->_IO_read_end = f->_IO_read_ptr = f->_IO_buf_base;
f->_IO_write_ptr = f->_IO_read_ptr;
f->_IO_write_base = f->_IO_write_ptr;
f->_IO_write_end = f->_IO_buf_end;
f->_IO_read_base = f->_IO_read_ptr = f->_IO_read_end;

f->_flags |= _IO_CURRENTLY_PUTTING;
if (f->_mode <= 0 && f->_flags & (_IO_LINE_BUF | _IO_UNBUFFERED))
f->_IO_write_end = f->_IO_write_ptr;
}
if (ch == EOF)
return _IO_do_write (f, f->_IO_write_base,
f->_IO_write_ptr - f->_IO_write_base);
if (f->_IO_write_ptr == f->_IO_buf_end ) /* Buffer is really full */
if (_IO_do_flush (f) == EOF)
return EOF;
*f->_IO_write_ptr++ = ch;
if ((f->_flags & _IO_UNBUFFERED)
|| ((f->_flags & _IO_LINE_BUF) && ch == '\n'))
if (_IO_do_write (f, f->_IO_write_base,
f->_IO_write_ptr - f->_IO_write_base) == EOF)
return EOF;
return (unsigned char) ch;
}
libc_hidden_ver (_IO_new_file_overflow, _IO_file_overflow)
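  1. If the stream has _IO_NO_WRITES set, record the error (_IO_ERR_SEEN, errno = EBADF) and return EOF.
  2. If the stream is not in put mode (_IO_CURRENTLY_PUTTING is clear) or _IO_write_base is NULL:
    • If _IO_write_base is NULL, first allocate the buffer with _IO_doallocbuf and point the read pointers at _IO_buf_base.
    • If the stream is in backup mode (_IO_IN_BACKUP):
      • call _IO_free_backup_area (f);
      • move f->_IO_read_base back by the smaller of f->_IO_read_end - f->_IO_read_ptr and f->_IO_read_base - f->_IO_buf_base;
      • set f->_IO_read_ptr = f->_IO_read_base.
    • If f->_IO_read_ptr == f->_IO_buf_end, set f->_IO_read_end = f->_IO_read_ptr = f->_IO_buf_base.
    • Set w-ptr and w-base to r-ptr, w-end to b-end, and r-base and r-ptr to r-end (these abbreviations for the read, write, and buf pointers are used from here on).
    • Set the _IO_CURRENTLY_PUTTING flag.
    • If the stream is not wide-character and is line buffered or unbuffered, set w-end = w-ptr.
  3. If the argument ch is EOF, call _IO_do_write on the pending data and return its result.
  4. If w-ptr == b-end (the buffer is really full), call _IO_do_flush; if it returns EOF, return EOF.
  5. *f->_IO_write_ptr++ = ch
  6. If the stream is unbuffered, or is line buffered and ch is '\n', call _IO_do_write; if it returns EOF, return EOF.
  7. Return (unsigned char) ch.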
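
The flush decisions in the last steps can be isolated in a toy model. The sketch below is not glibc code: `struct toy_stream`, `toy_put`, and the 4-byte buffer are assumptions made for illustration, and a "flush" merely counts how often a real write would have happened. It only mirrors the logic "flush when full, then store the character, then flush again if unbuffered or at a newline in line-buffered mode".

```c
#include <assert.h>
#include <stddef.h>

enum toy_mode { TOY_FULL, TOY_LINE, TOY_UNBUF };

struct toy_stream {
    char buf[4];           /* tiny buffer so that "full" is easy to reach */
    size_t used;           /* plays the role of w-ptr - w-base */
    enum toy_mode mode;
    size_t flushes;        /* how many simulated write calls happened */
    size_t flushed_bytes;  /* how many bytes those calls carried */
};

/* Simulated _IO_do_write: account for the pending bytes and empty the buffer. */
static void toy_flush(struct toy_stream *s)
{
    if (s->used == 0)
        return;
    s->flushes++;
    s->flushed_bytes += s->used;
    s->used = 0;
}

/* Simplified put path: flush if full, store ch, flush again if the mode demands it. */
int toy_put(struct toy_stream *s, int ch)
{
    assert(s != NULL);
    if (s->used == sizeof s->buf)      /* buffer is really full */
        toy_flush(s);
    s->buf[s->used++] = (char) ch;
    if (s->mode == TOY_UNBUF || (s->mode == TOY_LINE && ch == '\n'))
        toy_flush(s);
    return (unsigned char) ch;
}

/* Feeds a whole string through toy_put and returns the flush count so far. */
size_t toy_put_string(struct toy_stream *s, const char *str)
{
    assert(s != NULL && str != NULL);
    for (; *str != '\0'; str++)
        toy_put(s, *str);
    return s->flushes;
}
```

With the 4-byte buffer, writing "abcdef" in full mode flushes once (when the fifth character finds the buffer full), "ab\ncd" in line mode flushes once at the newline, and "abc" in unbuffered mode flushes three times.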

write

int
_IO_new_do_write (FILE *fp, const char *data, size_t to_do)
{
return (to_do == 0
|| (size_t) new_do_write (fp, data, to_do) == to_do) ? 0 : EOF;
}
libc_hidden_ver (_IO_new_do_write, _IO_do_write)

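If to_do == 0, it returns 0 immediately.

Otherwise it calls new_do_write (fp, data, to_do) and returns 0 if all to_do bytes were written, or EOF if not.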

static size_t
new_do_write (FILE *fp, const char *data, size_t to_do)
{
size_t count;
if (fp->_flags & _IO_IS_APPENDING)
/* On a system without a proper O_APPEND implementation,
you would need to sys_seek(0, SEEK_END) here, but is
not needed nor desirable for Unix- or Posix-like systems.
Instead, just indicate that offset (before and after) is
unpredictable. */
fp->_offset = _IO_pos_BAD;
else if (fp->_IO_read_end != fp->_IO_write_base)
{
off64_t new_pos
= _IO_SYSSEEK (fp, fp->_IO_write_base - fp->_IO_read_end, 1); /* re-align the kernel offset with the write position */
if (new_pos == _IO_pos_BAD)
return 0;
fp->_offset = new_pos;
}
count = _IO_SYSWRITE (fp, data, to_do);
if (fp->_cur_column && count) /* recompute the column on the last line */
fp->_cur_column = _IO_adjust_column (fp->_cur_column - 1, data, count) + 1;
_IO_setg (fp, fp->_IO_buf_base, fp->_IO_buf_base, fp->_IO_buf_base);
fp->_IO_write_base = fp->_IO_write_ptr = fp->_IO_buf_base;
fp->_IO_write_end = (fp->_mode <= 0
&& (fp->_flags & (_IO_LINE_BUF | _IO_UNBUFFERED))
? fp->_IO_buf_base : fp->_IO_buf_end);
return count;
}
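  • If _IO_IS_APPENDING is set, the file was opened in append mode. With a proper O_APPEND implementation the kernel places every write at the end of the file, so no seek is needed; fp->_offset is simply set to _IO_pos_BAD to mark the cached offset as unpredictable.

  • Otherwise the read and write pointers into the buffer have to be considered. If the read end pointer differs from the write base pointer, the kernel file offset (which corresponds to r-end) does not match the position where the pending data belongs. Since we are about to write, _IO_SYSSEEK is called relative to the current position (1 means SEEK_CUR) to bring the two into agreement.

    • If it returns the error value -1, return 0, meaning that zero bytes were written.
    • Otherwise store the new position in fp->_offset.
  • Call _IO_SYSWRITE (fp, data, to_do), which dispatches to the write slot of the vtable.
  • If the current column is non-zero (the field stores 1 + the column, with 0 meaning unknown) and at least one byte was written, update the column with _IO_adjust_column.
  • Call _IO_setg to reset the read base, ptr, and end to _IO_buf_base, then reset the write base and ptr to _IO_buf_base.

Note that the final w-end depends on the current mode and is either _IO_buf_base or _IO_buf_end:

  • fp->_mode <= 0 means a narrow (non-wide) stream, and fp->_flags & (_IO_LINE_BUF | _IO_UNBUFFERED) means the stream is line buffered or unbuffered. In that case w-end is set to _IO_buf_base, so the fast path sees no free space and every put goes through overflow. Otherwise w-end is set to _IO_buf_end, and the whole range from base to end is available as write buffer.

Now look at the column adjustment function: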

unsigned
_IO_adjust_column (unsigned start, const char *line, int count)
{
const char *ptr = line + count;
while (ptr > line)
if (*--ptr == '\n')
return line + count - ptr - 1;
return start + count;
}
libc_hidden_def (_IO_adjust_column)

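It updates the column on the last line:

  • ptr starts one past the last character actually written.

  • While ptr > line, the characters are scanned from back to front. When a newline is found the scan stops: everything after it belongs to the last line written, so line + count - ptr - 1 is the number of characters on that line, and this value is returned.

  • If no newline is found, it returns start + count, that is, the previous column plus the number of characters written.

The caller passes _cur_column - 1 and adds 1 to the result, so the overall effect is to keep the current column up to date.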

read

ssize_t
_IO_file_read (FILE *fp, void *buf, ssize_t size)
{
return (__builtin_expect (fp->_flags2 & _IO_FLAGS2_NOTCANCEL, 0)
? __read_nocancel (fp->_fileno, buf, size)
: __read (fp->_fileno, buf, size));
}
libc_hidden_def (_IO_file_read)

It simply invokes the read system call (the non-cancellable variant if _IO_FLAGS2_NOTCANCEL is set).

seek

off64_t
_IO_file_seek (FILE *fp, off64_t offset, int dir)
{
return __lseek64 (fp->_fileno, offset, dir);
}
libc_hidden_def (_IO_file_seek)

seek simply calls lseek.

It returns the resulting offset on success and -1 on error.

stat

int
_IO_file_stat (FILE *fp, void *st)
{
return __fxstat64 (_STAT_VER, fp->_fileno, (struct stat64 *) st);
}
libc_hidden_def (_IO_file_stat)

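It calls the fstat system call on the stream's file descriptor.

The file information is written into the stat structure that st points to; the return value is 0 on success and -1 on error.

underflow

underflow is mainly responsible for reading data from the file into the buffer.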

int
_IO_new_file_underflow (FILE *fp)
{
ssize_t count;

/* C99 requires EOF to be "sticky". */
if (fp->_flags & _IO_EOF_SEEN)
return EOF;

if (fp->_flags & _IO_NO_READS)
{
fp->_flags |= _IO_ERR_SEEN;
__set_errno (EBADF);
return EOF;
}
if (fp->_IO_read_ptr < fp->_IO_read_end)
return *(unsigned char *) fp->_IO_read_ptr;

if (fp->_IO_buf_base == NULL)
{
/* Maybe we already have a push back pointer. */
if (fp->_IO_save_base != NULL)
{
free (fp->_IO_save_base);
fp->_flags &= ~_IO_IN_BACKUP;
}
_IO_doallocbuf (fp);
}

/* FIXME This can/should be moved to genops ?? */
if (fp->_flags & (_IO_LINE_BUF|_IO_UNBUFFERED))
{
/* We used to flush all line-buffered stream. This really isn't
required by any standard. My recollection is that
traditional Unix systems did this for stdout. stderr better
not be line buffered. So we do just that here
explicitly. --drepper */
_IO_acquire_lock (stdout);

if ((stdout->_flags & (_IO_LINKED | _IO_NO_WRITES | _IO_LINE_BUF))
== (_IO_LINKED | _IO_LINE_BUF))
_IO_OVERFLOW (stdout, EOF);

_IO_release_lock (stdout);
}

_IO_switch_to_get_mode (fp);

/* This is very tricky. We have to adjust those
pointers before we call _IO_SYSREAD () since
we may longjump () out while waiting for
input. Those pointers may be screwed up. H.J. */
fp->_IO_read_base = fp->_IO_read_ptr = fp->_IO_buf_base;
fp->_IO_read_end = fp->_IO_buf_base;
fp->_IO_write_base = fp->_IO_write_ptr = fp->_IO_write_end
= fp->_IO_buf_base;

count = _IO_SYSREAD (fp, fp->_IO_buf_base,
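fp->_IO_buf_end - fp->_IO_buf_base); /* note: the read size is the whole buffer, so r-end may move far ahead of r-ptr and the kernel offset (_offset) runs ahead of what the caller has consumed */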
if (count <= 0)
{
if (count == 0)
fp->_flags |= _IO_EOF_SEEN;
else
fp->_flags |= _IO_ERR_SEEN, count = 0;
}
fp->_IO_read_end += count;
if (count == 0)
{
/* If a stream is read to EOF, the calling application may switch active
handles. As a result, our offset cache would no longer be valid, so
unset it. */
fp->_offset = _IO_pos_BAD;
return EOF;
}
if (fp->_offset != _IO_pos_BAD)
_IO_pos_adjust (fp->_offset, count);
return *(unsigned char *) fp->_IO_read_ptr;
}
libc_hidden_ver (_IO_new_file_underflow, _IO_file_underflow)
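  1. If end of file has already been seen (_IO_EOF_SEEN), return EOF.
  2. If the stream does not allow reading (_IO_NO_READS), set the error flag and return EOF.
  3. If r-ptr < r-end, return the character r-ptr points to, without consuming it.
  4. If the buffer has not been allocated yet:
    • if _IO_save_base is non-NULL, free it and clear the _IO_IN_BACKUP flag;
    • allocate the buffer with _IO_doallocbuf.
  5. If the stream is line buffered or unbuffered:
    • lock stdout;
    • if stdout is line buffered, is on the _IO_list_all chain, and is not write-protected, call overflow on stdout to flush it;
    • unlock stdout.
  6. Call _IO_switch_to_get_mode (fp).
  7. Reset all read and write pointers to fp->_IO_buf_base.
  8. Call _IO_SYSREAD (fp, fp->_IO_buf_base, fp->_IO_buf_end - fp->_IO_buf_base); the return value is count.
  9. If count <= 0:
    • if count is 0, set the end-of-file flag;
    • if count is negative, set the error flag and set count to 0.
  10. Move r-end forward by count.
  11. If count is 0, set fp->_offset to _IO_pos_BAD (the cached offset is no longer valid) and return EOF.
  12. If fp->_offset is not _IO_pos_BAD, advance it by count bytes.
  13. Return the character r-ptr points to.

Now look at _IO_switch_to_get_mode (fp), which is called above: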

int
_IO_switch_to_get_mode (FILE *fp)
{
if (fp->_IO_write_ptr > fp->_IO_write_base)
if (_IO_OVERFLOW (fp, EOF) == EOF)
return EOF;
if (_IO_in_backup (fp))
fp->_IO_read_base = fp->_IO_backup_base;
else
{
fp->_IO_read_base = fp->_IO_buf_base;
if (fp->_IO_write_ptr > fp->_IO_read_end)
fp->_IO_read_end = fp->_IO_write_ptr;
}
fp->_IO_read_ptr = fp->_IO_write_ptr;

fp->_IO_write_base = fp->_IO_write_ptr = fp->_IO_write_end = fp->_IO_read_ptr;

fp->_flags &= ~_IO_CURRENTLY_PUTTING;
return 0;
}
libc_hidden_def (_IO_switch_to_get_mode)
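  1. If w-ptr > w-base, the output buffer still holds data that has not been written to the file, so call _IO_OVERFLOW (fp, EOF); return EOF if that fails.
  2. If the stream is in backup mode, set fp->_IO_read_base = fp->_IO_backup_base.
    • Otherwise set fp->_IO_read_base = fp->_IO_buf_base.
      • If w-ptr > r-end, set r-end = w-ptr.
  3. Set r-ptr to w-ptr, then set all write pointers to r-ptr.
  4. Clear the stream's _IO_CURRENTLY_PUTTING flag.

Some of these pointer updates look redundant, since the enclosing underflow reassigns all of them afterwards anyway.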

uflow

int
_IO_default_uflow (FILE *fp)
{
int ch = _IO_UNDERFLOW (fp);
if (ch == EOF)
return EOF;
return *(unsigned char *) fp->_IO_read_ptr++;
}
libc_hidden_def (_IO_default_uflow)
  1. Call underflow.
  2. If underflow returns EOF, return EOF.
  3. Otherwise return the character at fp->_IO_read_ptr and advance the pointer by one.
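
The difference between underflow (peek, refilling if necessary) and uflow (peek, then consume) can be modelled without any real file. In the sketch below, `struct toy_in` and its functions are invented for illustration; an in-memory string stands in for the file and a 4-byte array for the stream buffer.

```c
#include <assert.h>
#include <stdio.h>    /* EOF */
#include <string.h>

struct toy_in {
    const char *src;      /* stands in for the file contents */
    size_t src_len;
    size_t src_pos;       /* stands in for the kernel file offset */
    char buf[4];          /* stream buffer */
    char *read_ptr;       /* next unread character */
    char *read_end;       /* end of valid data in buf */
};

void toy_in_init(struct toy_in *s, const char *data)
{
    assert(s != NULL && data != NULL);
    s->src = data;
    s->src_len = strlen(data);
    s->src_pos = 0;
    s->read_ptr = s->read_end = s->buf;
}

/* Returns the next character without consuming it, refilling the
   buffer with up to sizeof buf bytes when it is empty. */
int toy_underflow(struct toy_in *s)
{
    if (s->read_ptr < s->read_end)
        return (unsigned char) *s->read_ptr;

    size_t n = s->src_len - s->src_pos;
    if (n > sizeof s->buf)
        n = sizeof s->buf;
    if (n == 0)
        return EOF;
    memcpy(s->buf, s->src + s->src_pos, n);   /* the "read" call */
    s->src_pos += n;                          /* offset runs ahead of the reader */
    s->read_ptr = s->buf;
    s->read_end = s->buf + n;
    return (unsigned char) *s->read_ptr;
}

/* Same as toy_underflow, but also consumes the character. */
int toy_uflow(struct toy_in *s)
{
    if (toy_underflow(s) == EOF)
        return EOF;
    return (unsigned char) *s->read_ptr++;
}
```

After the first call on the input "hello", `src_pos` is already 4 although nothing has been consumed, which is the read-ahead effect that sync later has to compensate for.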

sync

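sync reconciles the read and write state: it writes out data that has not yet been written to the file and discards data that was read ahead but not yet consumed.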

int
_IO_new_file_sync (FILE *fp)
{
ssize_t delta;
int retval = 0;

/* char* ptr = cur_ptr(); */
if (fp->_IO_write_ptr > fp->_IO_write_base)
if (_IO_do_flush(fp)) return EOF;
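delta = fp->_IO_read_ptr - fp->_IO_read_end; /* read balance: the caller has consumed up to r-ptr, but the file was read up to r-end */
if (delta != 0)
{
off64_t new_pos = _IO_SYSSEEK (fp, delta, 1); /* seek back over the unconsumed read-ahead */
if (new_pos != (off64_t) EOF)
fp->_IO_read_end = fp->_IO_read_ptr; /* now balanced */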
else if (errno == ESPIPE)
; /* Ignore error from unseekable devices. */
else
retval = EOF;
}
if (retval != EOF)
fp->_offset = _IO_pos_BAD;
/* FIXME: Cleanup - can this be shared? */
/* setg(base(), ptr, ptr); */
return retval;
}
libc_hidden_ver (_IO_new_file_sync, _IO_file_sync)
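  1. If the write buffer holds unwritten data, call _IO_do_flush; if the data is not written, or only partly written, return EOF.
  2. Rebalance the read pointers: if r-ptr != r-end, seek the file offset back by the unread amount and set r-end = r-ptr. An ESPIPE error from an unseekable device is ignored; any other seek error yields EOF.
  3. On success, invalidate the cached offset by setting fp->_offset = _IO_pos_BAD.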
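
The read-side rebalancing is visible from user code, because fflush on a stream ends up in the sync slot. The following sketch assumes a POSIX system with glibc-style stdio behaviour and a stream buffer larger than 10 bytes; the function name `sync_demo` is invented. It reads one character from a 10-byte file and samples the kernel offset before and after fflush.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Reads one byte from a 10-byte temporary file and records the kernel
   file offset before and after fflush. Returns 0 on success, -1 on failure. */
int sync_demo(off_t *before, off_t *after)
{
    assert(before != NULL && after != NULL);

    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    if (fputs("0123456789", fp) == EOF) {
        fclose(fp);
        return -1;
    }
    rewind(fp);                      /* flush the data and go back to offset 0 */
    if (fgetc(fp) != '0') {          /* underflow reads ahead a whole buffer */
        fclose(fp);
        return -1;
    }
    *before = lseek(fileno(fp), 0, SEEK_CUR);  /* kernel offset after read-ahead */
    if (fflush(fp) != 0) {           /* sync: seek back over unread data */
        fclose(fp);
        return -1;
    }
    *after = lseek(fileno(fp), 0, SEEK_CUR);   /* kernel offset now matches r-ptr */
    fclose(fp);
    return 0;
}
```

Under these assumptions `before` is 10 (the whole file was read into the buffer) and `after` is 1 (only one character was actually consumed), which corresponds to the delta = r-ptr - r-end seek in the source above.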

imbue

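In 2.31, _IO_default_imbue is an empty function.

showmanyc

In 2.31, _IO_default_showmanyc does no real work either; it just returns -1.

close

Closes the file stream. The function examined here is _IO_new_file_close_it, the higher-level routine; the close slot in the jump table above is _IO_file_close, which _IO_new_file_close_it reaches through _IO_SYSCLOSE.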

int
_IO_new_file_close_it (FILE *fp)
{
int write_status;
if (!_IO_file_is_open (fp))
return EOF;

if ((fp->_flags & _IO_NO_WRITES) == 0
&& (fp->_flags & _IO_CURRENTLY_PUTTING) != 0)
write_status = _IO_do_flush (fp);
else
write_status = 0;

_IO_unsave_markers (fp);

int close_status = ((fp->_flags2 & _IO_FLAGS2_NOCLOSE) == 0
? _IO_SYSCLOSE (fp) : 0);

/* Free buffer. */
if (fp->_mode > 0)
{
if (_IO_have_wbackup (fp))
_IO_free_wbackup_area (fp);
_IO_wsetb (fp, NULL, NULL, 0);
_IO_wsetg (fp, NULL, NULL, NULL);
_IO_wsetp (fp, NULL, NULL);
}
_IO_setb (fp, NULL, NULL, 0);
_IO_setg (fp, NULL, NULL, NULL);
_IO_setp (fp, NULL, NULL);

_IO_un_link ((struct _IO_FILE_plus *) fp);
fp->_flags = _IO_MAGIC|CLOSED_FILEBUF_FLAGS;
fp->_fileno = -1;
fp->_offset = _IO_pos_BAD;

return close_status ? close_status : write_status;
}
libc_hidden_ver (_IO_new_file_close_it, _IO_file_close_it)
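  1. If the file is not open, return EOF immediately.
  2. If the file is not write-protected and is currently in put mode, call _IO_do_flush; otherwise set write_status = 0.
  3. Call _IO_unsave_markers (fp).
  4. If _IO_FLAGS2_NOCLOSE is not set, call _IO_SYSCLOSE to close the file descriptor.
  5. If the stream is in wide-character mode, release and reset the wide-character buffers.
  6. Set the buffer pointers to NULL.
  7. Call _IO_un_link to remove the stream from the chain.
  8. Reset the flags, set _fileno to -1, and set _offset to _IO_pos_BAD.
  9. Return close_status ? close_status : write_status.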

doallocate

int
_IO_file_doallocate (FILE *fp)
{
size_t size;
char *p;
struct stat64 st;

size = BUFSIZ;
if (fp->_fileno >= 0 && __builtin_expect (_IO_SYSSTAT (fp, &st), 0) >= 0)
{
if (S_ISCHR (st.st_mode))
{
/* Possibly a tty. */
if (
#ifdef DEV_TTY_P
DEV_TTY_P (&st) ||
#endif
local_isatty (fp->_fileno))
fp->_flags |= _IO_LINE_BUF;
}
#if defined _STATBUF_ST_BLKSIZE
if (st.st_blksize > 0 && st.st_blksize < BUFSIZ)
size = st.st_blksize;
#endif
}
p = malloc (size);
if (__glibc_unlikely (p == NULL))
return EOF;
_IO_setb (fp, p, p + size, 1);
return 1;
}
libc_hidden_def (_IO_file_doallocate)
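  1. If the file descriptor is >= 0 and _IO_SYSSTAT succeeds:
    • if the file is a character device and is a tty, set the _IO_LINE_BUF flag;
    • if st_blksize is positive and smaller than BUFSIZ, use it as the buffer size instead of BUFSIZ.
  2. Allocate the buffer with malloc; return EOF if that fails.
  3. Install the buffer pointers with _IO_setb and return 1.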
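
The decision logic can be reproduced with public POSIX calls. The sketch below is an approximation, not the glibc implementation: `struct buf_plan` and `plan_buffer` are invented names, fstat and isatty replace _IO_SYSSTAT and local_isatty, and the DEV_TTY_P shortcut is left out.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* What a doallocate-style routine would decide for a descriptor. */
struct buf_plan {
    size_t size;         /* buffer size to allocate */
    int line_buffered;   /* 1 if the stream should be line buffered */
};

struct buf_plan plan_buffer(int fd)
{
    struct buf_plan plan = { BUFSIZ, 0 };   /* defaults when stat is unavailable */
    struct stat st;

    if (fd >= 0 && fstat(fd, &st) == 0) {
        /* Terminals get line buffering. */
        if (S_ISCHR(st.st_mode) && isatty(fd))
            plan.line_buffered = 1;
        /* A smaller preferred block size overrides BUFSIZ. */
        if (st.st_blksize > 0 && (size_t) st.st_blksize < BUFSIZ)
            plan.size = (size_t) st.st_blksize;
    }
    assert(plan.size > 0);
    return plan;
}
```

For an invalid descriptor such as -1 the plan is simply BUFSIZ with no line buffering; for a regular file the size never exceeds BUFSIZ.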

seekpos

off64_t
_IO_default_seekpos (FILE *fp, off64_t pos, int mode)
{
return _IO_SEEKOFF (fp, pos, 0, mode);
}

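It simply calls _IO_SEEKOFF with dir = 0 (_IO_seek_set).

seekoff

This one does not come up very often, so the analysis is skipped for now; the source is listed for reference.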

off64_t
_IO_new_file_seekoff (FILE *fp, off64_t offset, int dir, int mode)
{
off64_t result;
off64_t delta, new_offset;
long count;

/* Short-circuit into a separate function. We don't want to mix any
functionality and we don't want to touch anything inside the FILE
object. */
if (mode == 0)
return do_ftell (fp);

/* POSIX.1 8.2.3.7 says that after a call the fflush() the file
offset of the underlying file must be exact. */
int must_be_exact = (fp->_IO_read_base == fp->_IO_read_end
&& fp->_IO_write_base == fp->_IO_write_ptr);

bool was_writing = (fp->_IO_write_ptr > fp->_IO_write_base
|| _IO_in_put_mode (fp));

/* Flush unwritten characters.
(This may do an unneeded write if we seek within the buffer.
But to be able to switch to reading, we would need to set
egptr to pptr. That can't be done in the current design,
which assumes file_ptr() is eGptr. Anyway, since we probably
end up flushing when we close(), it doesn't make much difference.)
FIXME: simulate mem-mapped files. */
if (was_writing && _IO_switch_to_get_mode (fp))
return EOF;

if (fp->_IO_buf_base == NULL)
{
/* It could be that we already have a pushback buffer. */
if (fp->_IO_read_base != NULL)
{
free (fp->_IO_read_base);
fp->_flags &= ~_IO_IN_BACKUP;
}
_IO_doallocbuf (fp);
_IO_setp (fp, fp->_IO_buf_base, fp->_IO_buf_base);
_IO_setg (fp, fp->_IO_buf_base, fp->_IO_buf_base, fp->_IO_buf_base);
}

switch (dir)
{
case _IO_seek_cur:
/* Adjust for read-ahead (bytes is buffer). */
offset -= fp->_IO_read_end - fp->_IO_read_ptr;

if (fp->_offset == _IO_pos_BAD)
goto dumb;
/* Make offset absolute, assuming current pointer is file_ptr(). */
offset += fp->_offset;
if (offset < 0)
{
__set_errno (EINVAL);
return EOF;
}

dir = _IO_seek_set;
break;
case _IO_seek_set:
break;
case _IO_seek_end:
{
struct stat64 st;
if (_IO_SYSSTAT (fp, &st) == 0 && S_ISREG (st.st_mode))
{
offset += st.st_size;
dir = _IO_seek_set;
}
else
goto dumb;
}
}

_IO_free_backup_area (fp);

/* At this point, dir==_IO_seek_set. */

/* If destination is within current buffer, optimize: */
if (fp->_offset != _IO_pos_BAD && fp->_IO_read_base != NULL
&& !_IO_in_backup (fp))
{
off64_t start_offset = (fp->_offset
- (fp->_IO_read_end - fp->_IO_buf_base));
if (offset >= start_offset && offset < fp->_offset)
{
_IO_setg (fp, fp->_IO_buf_base,
fp->_IO_buf_base + (offset - start_offset),
fp->_IO_read_end);
_IO_setp (fp, fp->_IO_buf_base, fp->_IO_buf_base);

_IO_mask_flags (fp, 0, _IO_EOF_SEEN);
goto resync;
}
}

if (fp->_flags & _IO_NO_READS)
goto dumb;

/* Try to seek to a block boundary, to improve kernel page management. */
new_offset = offset & ~(fp->_IO_buf_end - fp->_IO_buf_base - 1);
delta = offset - new_offset;
if (delta > fp->_IO_buf_end - fp->_IO_buf_base)
{
new_offset = offset;
delta = 0;
}
result = _IO_SYSSEEK (fp, new_offset, 0);
if (result < 0)
return EOF;
if (delta == 0)
count = 0;
else
{
count = _IO_SYSREAD (fp, fp->_IO_buf_base,
(must_be_exact
? delta : fp->_IO_buf_end - fp->_IO_buf_base));
if (count < delta)
{
/* We weren't allowed to read, but try to seek the remainder. */
offset = count == EOF ? delta : delta-count;
dir = _IO_seek_cur;
goto dumb;
}
}
_IO_setg (fp, fp->_IO_buf_base, fp->_IO_buf_base + delta,
fp->_IO_buf_base + count);
_IO_setp (fp, fp->_IO_buf_base, fp->_IO_buf_base);
fp->_offset = result + count;
_IO_mask_flags (fp, 0, _IO_EOF_SEEN);
return offset;
dumb:

_IO_unsave_markers (fp);
result = _IO_SYSSEEK (fp, offset, dir);
if (result != EOF)
{
_IO_mask_flags (fp, 0, _IO_EOF_SEEN);
fp->_offset = result;
_IO_setg (fp, fp->_IO_buf_base, fp->_IO_buf_base, fp->_IO_buf_base);
_IO_setp (fp, fp->_IO_buf_base, fp->_IO_buf_base);
}
return result;

resync:
/* We need to do it since it is possible that the file offset in
the kernel may be changed behind our back. It may happen when
we fopen a file and then do a fork. One process may access the
file and the kernel file offset will be changed. */
if (fp->_offset >= 0)
_IO_SYSSEEK (fp, fp->_offset, 0);

return offset;
}
libc_hidden_ver (_IO_new_file_seekoff, _IO_file_seekoff)

pbackfail

This one does not come up very often, so the analysis is skipped for now; the source is listed for reference.

int
_IO_default_pbackfail (FILE *fp, int c)
{
if (fp->_IO_read_ptr > fp->_IO_read_base && !_IO_in_backup (fp)
&& (unsigned char) fp->_IO_read_ptr[-1] == c)
--fp->_IO_read_ptr;
else
{
/* Need to handle a filebuf in write mode (switch to read mode). FIXME!*/
if (!_IO_in_backup (fp))
{
/* We need to keep the invariant that the main get area
logically follows the backup area. */
if (fp->_IO_read_ptr > fp->_IO_read_base && _IO_have_backup (fp))
{
if (save_for_backup (fp, fp->_IO_read_ptr))
return EOF;
}
else if (!_IO_have_backup (fp))
{
/* No backup buffer: allocate one. */
/* Use nshort buffer, if unused? (probably not) FIXME */
int backup_size = 128;
char *bbuf = (char *) malloc (backup_size);
if (bbuf == NULL)
return EOF;
fp->_IO_save_base = bbuf;
fp->_IO_save_end = fp->_IO_save_base + backup_size;
fp->_IO_backup_base = fp->_IO_save_end;
}
fp->_IO_read_base = fp->_IO_read_ptr;
_IO_switch_to_backup_area (fp);
}
else if (fp->_IO_read_ptr <= fp->_IO_read_base)
{
/* Increase size of existing backup buffer. */
size_t new_size;
size_t old_size = fp->_IO_read_end - fp->_IO_read_base;
char *new_buf;
new_size = 2 * old_size;
new_buf = (char *) malloc (new_size);
if (new_buf == NULL)
return EOF;
memcpy (new_buf + (new_size - old_size), fp->_IO_read_base,
old_size);
free (fp->_IO_read_base);
_IO_setg (fp, new_buf, new_buf + (new_size - old_size),
new_buf + new_size);
fp->_IO_backup_base = fp->_IO_read_ptr;
}

*--fp->_IO_read_ptr = c;
}
return (unsigned char) c;
}
libc_hidden_def (_IO_default_pbackfail)

setbuf

FILE *
_IO_new_file_setbuf (FILE *fp, char *p, ssize_t len)
{
if (_IO_default_setbuf (fp, p, len) == NULL)
return NULL;

fp->_IO_write_base = fp->_IO_write_ptr = fp->_IO_write_end
= fp->_IO_buf_base;
_IO_setg (fp, fp->_IO_buf_base, fp->_IO_buf_base, fp->_IO_buf_base);

return fp;
}
libc_hidden_ver (_IO_new_file_setbuf, _IO_file_setbuf)
  1. Call _IO_default_setbuf (fp, p, len); return NULL if it fails.
  2. Reset the put and get pointers to _IO_buf_base.

Now look at _IO_default_setbuf (fp, p, len):

FILE *
_IO_default_setbuf (FILE *fp, char *p, ssize_t len)
{
if (_IO_SYNC (fp) == EOF)
return NULL;
if (p == NULL || len == 0)
{
fp->_flags |= _IO_UNBUFFERED;
_IO_setb (fp, fp->_shortbuf, fp->_shortbuf+1, 0);
}
else
{
fp->_flags &= ~_IO_UNBUFFERED;
_IO_setb (fp, p, p+len, 0);
}
fp->_IO_write_base = fp->_IO_write_ptr = fp->_IO_write_end = 0;
fp->_IO_read_base = fp->_IO_read_ptr = fp->_IO_read_end = 0;
return fp;
}
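  1. If the sync fails, return NULL immediately.
  2. If no buffer is given or the length is 0, put the stream into unbuffered mode and point the buffer at _shortbuf.
  3. Otherwise clear the unbuffered flag and use the given region as the buffer.
  4. Set the put and get pointers to NULL.
  5. Return fp.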
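
From user code this path is normally reached through setvbuf. The sketch below, assuming a POSIX system, switches a temporary file to unbuffered mode with setvbuf(fp, NULL, _IONBF, 0) and checks that written data reaches the file without any explicit flush. The function name `unbuffered_write_demo` is invented for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Writes 3 bytes to an unbuffered temporary file and reports the
   kernel-visible file size without calling fflush.
   Returns 0 on success, -1 on failure. */
int unbuffered_write_demo(long *size_without_flush)
{
    assert(size_without_flush != NULL);

    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    /* No buffer and length 0: the stream becomes unbuffered. */
    if (setvbuf(fp, NULL, _IONBF, 0) != 0 || fputs("abc", fp) == EOF) {
        fclose(fp);
        return -1;
    }
    struct stat st;
    if (fstat(fileno(fp), &st) != 0) {
        fclose(fp);
        return -1;
    }
    *size_without_flush = (long) st.st_size;  /* already on the file */
    fclose(fp);
    return 0;
}
```

The reported size should be 3, in contrast to a fully buffered stream, where the data would still be sitting in the stdio buffer at this point.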

xsgetn

size_t
_IO_file_xsgetn (FILE *fp, void *data, size_t n)
{
  size_t want, have;
  ssize_t count;
  char *s = data;

  want = n;

  if (fp->_IO_buf_base == NULL)
    {
      /* Maybe we already have a push back pointer. */
      if (fp->_IO_save_base != NULL)
        {
          free (fp->_IO_save_base);
          fp->_flags &= ~_IO_IN_BACKUP;
        }
      _IO_doallocbuf (fp);
    }

  while (want > 0)
    {
      have = fp->_IO_read_end - fp->_IO_read_ptr;
      if (want <= have)
        {
          memcpy (s, fp->_IO_read_ptr, want);
          fp->_IO_read_ptr += want;
          want = 0;
        }
      else
        {
          if (have > 0)
            {
              s = __mempcpy (s, fp->_IO_read_ptr, have);
              want -= have;
              fp->_IO_read_ptr += have;
            }

          /* Check for backup and repeat */
          if (_IO_in_backup (fp))
            {
              _IO_switch_to_main_get_area (fp);
              continue;
            }

          /* If we now want less than a buffer, underflow and repeat
             the copy. Otherwise, _IO_SYSREAD directly to
             the user buffer. */
          if (fp->_IO_buf_base
              && want < (size_t) (fp->_IO_buf_end - fp->_IO_buf_base))
            {
              if (__underflow (fp) == EOF)
                break;

              continue;
            }

          /* These must be set before the sysread as we might longjmp out
             waiting for input. */
          _IO_setg (fp, fp->_IO_buf_base, fp->_IO_buf_base, fp->_IO_buf_base);
          _IO_setp (fp, fp->_IO_buf_base, fp->_IO_buf_base);

          /* Try to maintain alignment: read a whole number of blocks. */
          count = want;
          if (fp->_IO_buf_base)
            {
              size_t block_size = fp->_IO_buf_end - fp->_IO_buf_base;
              if (block_size >= 128)
                count -= want % block_size;
            }

          count = _IO_SYSREAD (fp, s, count);
          if (count <= 0)
            {
              if (count == 0)
                fp->_flags |= _IO_EOF_SEEN;
              else
                fp->_flags |= _IO_ERR_SEEN;

              break;
            }

          s += count;
          want -= count;
          if (fp->_offset != _IO_pos_BAD)
            _IO_pos_adjust (fp->_offset, count);
        }
    }

  return n - want;
}
libc_hidden_def (_IO_file_xsgetn)
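  1. If _IO_buf_base is NULL:

    • If _IO_save_base is not NULL, free it and clear the _IO_IN_BACKUP flag.
    • Call _IO_doallocbuf to allocate a buffer.
  2. Loop while want (the number of bytes still needed) is greater than 0:

    • If want is no more than have (the data still unread in the get area), copy want bytes straight to the destination, advance _IO_read_ptr, and set want to 0.

    • Otherwise (want > have):

      • If have is greater than 0, first copy those have bytes to the destination.
      • If the stream is in backup mode, call _IO_switch_to_main_get_area (fp) and continue with the next iteration.
      • If a buffer exists and want is smaller than the buffer capacity, call __underflow; if it returns EOF, break out of the loop, otherwise continue with the next iteration.
      • Otherwise (there is no buffer, or want is at least the buffer capacity):
        • Reset the get-area and put-area pointers to _IO_buf_base.
        • If a buffer exists and its size is at least 128, execute count -= want % block_size;. This rounds count down to a whole number of blocks, which are read directly by the system call; the remainder is handled in a later iteration.
        • Call _IO_SYSREAD (fp, s, count). If the previous step left count unchanged, this one call can read everything that is still wanted. Either way the data goes straight to the destination without passing through the buffer. Depending on the return value:
          • If it is 0, set _IO_EOF_SEEN; if it is negative, set _IO_ERR_SEEN. In both cases, break out of the loop.
          • If it is greater than 0, carry on.
        • s += count; want -= count; and if the cached file offset is valid (fp->_offset != _IO_pos_BAD), advance it by count.
        • Start the loop again and repeat the steps above.
  3. Return n - want (the number of bytes actually read).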
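
The block-alignment arithmetic in the direct-read branch is easy to get wrong, so here is a small sketch of it in isolation. `direct_read_count` is a hypothetical helper, not a glibc function; a `block_size` of 0 stands for "no buffer allocated".

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes the direct-read branch would request from the OS.
   The branch is only reached when there is no buffer or when
   want >= block_size, which the assertion documents. */
static size_t direct_read_count(size_t want, size_t block_size)
{
    assert(block_size == 0 || want >= block_size);
    size_t count = want;
    if (block_size >= 128)
        count -= want % block_size;   /* round down to whole blocks */
    return count;
}
```

For example, with a 4096-byte buffer and 10000 bytes wanted, the helper asks for 8192 bytes; the remaining 1808 are smaller than the buffer, so in the real function a later iteration would fetch them through __underflow.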

xsputn

size_t
_IO_new_file_xsputn (FILE *f, const void *data, size_t n)
{
  const char *s = (const char *) data;
  size_t to_do = n;
  int must_flush = 0;
  size_t count = 0;

  if (n <= 0)
    return 0;
  /* This is an optimized implementation.
     If the amount to be written straddles a block boundary
     (or the filebuf is unbuffered), use sys_write directly. */

  /* First figure out how much space is available in the buffer. */
  if ((f->_flags & _IO_LINE_BUF) && (f->_flags & _IO_CURRENTLY_PUTTING))
    {
      count = f->_IO_buf_end - f->_IO_write_ptr;
      if (count >= n)
        {
          const char *p;
          for (p = s + n; p > s; )
            {
              if (*--p == '\n')
                {
                  count = p - s + 1;
                  must_flush = 1;
                  break;
                }
            }
        }
    }
  else if (f->_IO_write_end > f->_IO_write_ptr)
    count = f->_IO_write_end - f->_IO_write_ptr; /* Space available. */

  /* Then fill the buffer. */
  if (count > 0)
    {
      if (count > to_do)
        count = to_do;
      f->_IO_write_ptr = __mempcpy (f->_IO_write_ptr, s, count);
      s += count;
      to_do -= count;
    }
  if (to_do + must_flush > 0)
    {
      size_t block_size, do_write;
      /* Next flush the (full) buffer. */
      if (_IO_OVERFLOW (f, EOF) == EOF)
        /* If nothing else has to be written we must not signal the
           caller that everything has been written. */
        return to_do == 0 ? EOF : n - to_do;

      /* Try to maintain alignment: write a whole number of blocks. */
      block_size = f->_IO_buf_end - f->_IO_buf_base;
      do_write = to_do - (block_size >= 128 ? to_do % block_size : 0);

      if (do_write)
        {
          count = new_do_write (f, s, do_write);
          to_do -= count;
          if (count < do_write)
            return n - to_do; // bypasses the buffer
        }

      /* Now write out the remainder. Normally, this will fit in the
         buffer, but it's somewhat messier for line-buffered files,
         so we let _IO_default_xsputn handle the general case. */
      if (to_do)
        to_do -= _IO_default_xsputn (f, s+do_write, to_do);
    }
  return n - to_do;
}
libc_hidden_ver (_IO_new_file_xsputn, _IO_file_xsputn)
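  1. If n is 0, return 0 immediately.
  2. If the stream is line buffered and currently in put mode:
    • Set count = _IO_buf_end - _IO_write_ptr. If count >= n, scan the data backwards from its end for a '\n'; if one is found, set count to the length of the data up to and including that last '\n' (p - s + 1) and set must_flush to 1.
  3. Otherwise, if _IO_write_end > _IO_write_ptr, set count = _IO_write_end - _IO_write_ptr (the free space in the buffer).
  4. If count is greater than 0:
    • If count > to_do, set count = to_do.
    • Copy count bytes from the source into the buffer.
    • s += count; to_do -= count;
  5. If to_do + must_flush > 0:
    • Call _IO_OVERFLOW (f, EOF) to flush the buffer; if it returns EOF, return to_do == 0 ? EOF : n - to_do.
    • do_write = to_do - (block_size >= 128 ? to_do % block_size : 0); that is, the part of to_do that makes up a whole number of blocks (or all of to_do when the buffer is smaller than 128 bytes).
    • If do_write is nonzero, call new_do_write (f, s, do_write) to write that part directly, bypassing the buffer.
      • If fewer than do_write bytes were written, return n - to_do.
    • If to_do is still nonzero, call _IO_default_xsputn (f, s+do_write, to_do) to handle the remainder.
  6. Return n - to_do.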
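
Two pieces of arithmetic in the steps above can be tried out on their own: the backwards newline scan used for line-buffered streams, and the split between the directly written part and the remainder. Both helpers below are hypothetical sketches, not glibc functions. Note one deliberate difference: when no newline is found, the real code leaves count at the available space, whereas this helper returns 0.

```c
#include <assert.h>
#include <stddef.h>

/* Length of the prefix of s[0..n) that ends with the last '\n',
   or 0 if the data contains no newline (an assumption of this sketch). */
static size_t bytes_through_last_newline(const char *s, size_t n)
{
    assert(s != NULL);
    const char *p = s + n;
    while (p > s) {
        if (*--p == '\n')
            return (size_t)(p - s) + 1;   /* same as count = p - s + 1 */
    }
    return 0;
}

/* Part of to_do that would be handed to the direct write. */
static size_t direct_write_size(size_t to_do, size_t block_size)
{
    return to_do - (block_size >= 128 ? to_do % block_size : 0);
}
```

With a 4096-byte buffer, a 10000-byte request is split into 8192 bytes written directly and 1808 bytes left for _IO_default_xsputn, while a 100-byte request is left entirely to _IO_default_xsputn.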

Next, look at _IO_default_xsputn (f, s+do_write, to_do):

size_t
_IO_default_xsputn (FILE *f, const void *data, size_t n)
{
  const char *s = (char *) data;
  size_t more = n;
  if (more <= 0)
    return 0;
  for (;;)
    {
      /* Space available. */
      if (f->_IO_write_ptr < f->_IO_write_end)
        {
          size_t count = f->_IO_write_end - f->_IO_write_ptr;
          if (count > more)
            count = more;
          if (count > 20)
            {
              f->_IO_write_ptr = __mempcpy (f->_IO_write_ptr, s, count);
              s += count;
            }
          else if (count)
            {
              char *p = f->_IO_write_ptr;
              ssize_t i;
              for (i = count; --i >= 0; )
                *p++ = *s++;
              f->_IO_write_ptr = p;
            }
          more -= count;
        }
      if (more == 0 || _IO_OVERFLOW (f, (unsigned char) *s++) == EOF)
        break;
      more--;
    }
  return n - more;
}
libc_hidden_def (_IO_default_xsputn)
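  1. If more is 0, return 0 immediately.
  2. Loop:
    • If f->_IO_write_ptr < f->_IO_write_end (the buffer has free space):
      • Set count to the free space; if count > more, set count = more.
      • If count > 20, copy the data into the buffer with __mempcpy.
      • Otherwise, if count is nonzero (at most 20), copy the data into the buffer byte by byte.
      • more -= count.
    • If more == 0 or _IO_OVERFLOW (f, (unsigned char) *s++) == EOF, exit the loop. _IO_OVERFLOW is only called when more != 0: at that point the buffer is full and has to be flushed, and the call also stores one character of its own into the buffer (a mechanism originally meant for line buffering), which is why more-- follows.
    • Otherwise, more--.
  3. Return n - more.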
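
To see why the loop decrements more after each overflow call, the following toy model replays the fill-then-overflow cycle on an 8-byte buffer. Everything here (`struct toy_stream`, `toy_overflow`, `toy_xsputn`, the fixed sizes) is hypothetical; the overflow is assumed to flush the full buffer to a sink and then store the one extra character, which simplifies the real _IO_OVERFLOW.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_stream {
    char buf[8];         /* the stream buffer */
    size_t used;         /* models write_ptr - write_base */
    char sink[64];       /* stands in for the underlying file */
    size_t sink_len;
    int overflow_calls;
};

/* Flush the buffer to the sink, then store ch in the emptied buffer.
   Returns -1 (standing in for EOF) if the sink is full. */
static int toy_overflow(struct toy_stream *f, unsigned char ch)
{
    if (f->sink_len + f->used > sizeof f->sink)
        return -1;
    memcpy(f->sink + f->sink_len, f->buf, f->used);
    f->sink_len += f->used;
    f->used = 0;
    f->buf[f->used++] = (char) ch;
    f->overflow_calls++;
    return ch;
}

/* Same loop shape as the listing: fill the free space, then let the
   overflow consume exactly one more character. */
static size_t toy_xsputn(struct toy_stream *f, const char *s, size_t n)
{
    assert(f != NULL && s != NULL);
    size_t more = n;
    if (more == 0)
        return 0;
    for (;;) {
        size_t space = sizeof f->buf - f->used;
        if (space > 0) {
            size_t count = space < more ? space : more;
            memcpy(f->buf + f->used, s, count);
            f->used += count;
            s += count;
            more -= count;
        }
        if (more == 0 || toy_overflow(f, (unsigned char) *s++) == -1)
            break;
        more--;   /* the overflow call consumed one character */
    }
    return n - more;
}
```

Writing 20 bytes through this model takes two overflow calls: 16 bytes reach the sink and the last 4 stay in the buffer until something flushes them.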

Loose ends

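  1. From the above, the buffering mode has little effect on the read path and a much larger effect on the write path; both paths, however, are also shaped by the higher-level functions that call them.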