Austin Group Minutes of the 4 November Teleconference Austin-226
Submitted by Andrew Josey, The Open Group. November 5, 2004
Attendees
Andrew Josey, The Open Group
Nick Stoughton, USENIX, ISO/IEC OR
Don Cragun, Sun, PASC OR
Bruce Korb
Apologies
Ulrich Drepper, Red Hat
Mark Brown, IBM, TOG OR
Status of next plenary meeting
----------------
Andrew reported that it is possible that the PASC SEC will also meet
during the same week. This is to be confirmed.
On agenda items, we expect to spend at least one whole day
on the topic of utility syntax guidelines / option ordering.
Andrew will update the online agenda with the latest thoughts.
Other topics
------------
Nick reported that the C committee expect to produce a technical report
on some new string functions which have bounds checking and will be
looking for feedback. Andrew mentioned that the Base WG has a proposal
in a strawman draft to address this another way.
Defect Report Processing
-------------------------
The group picked up on the latest batch of defect reports,
which are available at the following URL:
http://www.opengroup.org/austin/aardvark/latest/
XBD ERN 23 signal.h SIGPOLL Accept as marked below
This is an Interpretation:
The standard states the requirements for SIGPOLL to be supported as
part of the XSI option, and conforming implementations must conform
to this. However, concerns have been raised about this which are being
referred to the sponsor.
Rationale:
The semantics of SIGPOLL are only specified with functionality
in the XSR option.
Notes to the Editor for a future revision (not part of this interpretation):
In XBD, signal.h
In the DESCRIPTION
Change from:
"[XSI] SIGPOLL T Pollable event."
To:
"[XSR] SIGPOLL T Pollable event."
and start XSI shading on the line below.
In the definition of siginfo_t XSR mark and shade the following:
"long si_band Band event for SIGPOLL. "
The following should be XSR marked and shaded in the table headed Signals/Codes:
"SIGPOLL  POLL_IN   Data input available.
          POLL_OUT  Output buffers available.
          POLL_MSG  Input message available.
          POLL_ERR  I/O error.
          POLL_PRI  High priority input available.
          POLL_HUP  Device disconnected."
And in the table headed Signal/Member/Value XSR mark and shade:
"SIGPOLL long si_band Band event for POLL_IN, POLL_OUT, or POLL_MSG."
XBD ERN 24 POSIX advisory information Accept as marked below
No change required
This is presently covered in B.2.8 Realtime, Advisory Information
Advisory Information
POSIX.1b contains an Informative Annex with proposed interfaces for
"realtime files". These interfaces could determine groups of the exact
parameters required to do "direct I/O" or "extents". These interfaces
were objected to by a significant portion of the balloting group as
too complex. A conforming application had little chance of correctly
navigating the large parameter space to match its desires to the
system. In addition, they only applied to a new type of file (realtime
files) and they told the implementation exactly what to do as opposed
to advising the implementation on application behavior and letting it
optimize for the system the (portable) application was running on. For
example, it was not clear how a system that had a disk array should set
its parameters.
There seemed to be several overall goals:
* Optimizing sequential access
* Optimizing caching behavior
* Optimizing I/O data transfer
* Preallocation
The advisory interfaces, posix_fadvise() and posix_madvise(), satisfy
the first two goals. The POSIX_FADV_SEQUENTIAL and POSIX_MADV_SEQUENTIAL
advice tells the implementation to expect serial access. Typically
the system will prefetch the next several serial accesses in order
to overlap I/O. It may also free previously accessed serial data if
memory is tight. If the application is not doing serial access it
can use POSIX_FADV_WILLNEED and POSIX_MADV_WILLNEED to accomplish I/O
overlap, as required. When the application advises POSIX_FADV_RANDOM or
POSIX_MADV_RANDOM behavior, the implementation usually tries to fetch
a minimum amount of data with each request and it does not expect much
locality. POSIX_FADV_DONTNEED and POSIX_MADV_DONTNEED allow the system
to free up caching resources as the data will not be required in the
near future.
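As an illustration of the advice values described above, the following Python sketch advises serial access before reading a file from start to end. The helper names read_sequentially and demo, and the chunk size, are assumptions made for this example only. Python exposes os.posix_fadvise only on platforms that provide the underlying call, so the sketch treats the advice as optional.

```python
import os
import tempfile


def read_sequentially(path, chunk_size=65536):
    """Read a whole file after advising the system to expect serial access."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # The advice is only a hint; skip it where the call is unavailable.
        if hasattr(os, "posix_fadvise"):
            try:
                # offset 0 and len 0 cover the file from start to end.
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
            except OSError:
                pass  # a rejected hint must not stop the read
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            total += len(chunk)
    finally:
        os.close(fd)
    return total


def demo():
    """Write a scratch file (hypothetical data) and read it back serially."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"x" * 200000)
        tmp.flush()
        return read_sequentially(tmp.name)
```

Because the advice never changes what is read, the function returns the same byte count whether or not the hint is honoured.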
POSIX_FADV_NOREUSE tells the system that caching the specified data is
not optimal. For file I/O, the transfer should go directly to the user
buffer instead of being cached internally by the implementation. To
portably perform direct disk I/O on all systems, the application must
perform its I/O transfers according to the following rules:
1. The user buffer should be aligned according to the
{POSIX_REC_XFER_ALIGN} pathconf() variable.
2. The number of bytes transferred in an I/O operation should be a
multiple of the {POSIX_ALLOC_SIZE_MIN} pathconf() variable.
3. The offset into the file at the start of an I/O operation should
be a multiple of the {POSIX_ALLOC_SIZE_MIN} pathconf()
variable.
4. The application should ensure that all threads which open a
given file specify POSIX_FADV_NOREUSE to be sure that there is no
unexpected interaction between threads using buffered I/O and threads
using direct I/O to the same file.
In some cases, a user buffer must be properly aligned in order to be
transferred directly to/from the device. The {POSIX_REC_XFER_ALIGN}
pathconf() variable tells the application the proper alignment.
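The offset, length, and alignment rules above amount to simple arithmetic, which might be sketched as follows. The function names query_limit, align_request, and is_aligned are hypothetical. The pathconf names passed to query_limit are assumed to be those listed in Python's os.pathconf_names (for example "PC_ALLOC_SIZE_MIN" and "PC_REC_XFER_ALIGN") on platforms that define them, and the caller supplies a default for platforms that do not.

```python
import os


def query_limit(path, name, default):
    """Return the pathconf() value for name, or default if it is unavailable."""
    names = getattr(os, "pathconf_names", {})
    if name not in names:
        return default
    try:
        value = os.pathconf(path, name)
    except (OSError, ValueError):
        return default
    # A non-positive result means the system reports no usable value.
    return value if value > 0 else default


def align_request(offset, nbytes, alloc_min):
    """Widen [offset, offset + nbytes) so both ends are multiples of alloc_min."""
    start = offset - offset % alloc_min                      # round down
    end = -(-(offset + nbytes) // alloc_min) * alloc_min     # round up
    return start, end - start


def is_aligned(address, xfer_align):
    """True if a buffer address satisfies the transfer alignment."""
    return address % xfer_align == 0
```

For example, with an allocation size of 4096 bytes, a request for 3000 bytes at offset 5000 would be widened to 4096 bytes at offset 4096.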
The preallocation goal is met by the space control function,
posix_fallocate(). The application can use posix_fallocate() to guarantee
no [ENOSPC] errors and to improve performance by prepaying any overhead
required for block allocation.
Implementations may use information conveyed by a previous posix_fadvise()
call to influence the manner in which allocation is performed. For
example, if an application did the following calls:
fd = open("file");
posix_fadvise(fd, offset, len, POSIX_FADV_SEQUENTIAL);
posix_fallocate(fd, len, size);
an implementation might allocate the file contiguously on disk.
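A runnable variant of this sequence might look like the Python sketch below. It is an illustration only: the names preallocate and demo, the file name data.bin, and the one-megabyte size are assumptions. Python's os.posix_fallocate takes the arguments (fd, offset, len), and the ftruncate fallback used where that call is missing sets the file size without reserving any space.

```python
import os
import tempfile


def preallocate(path, size):
    """Advise serial access, then reserve size bytes; return the file size."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        if hasattr(os, "posix_fadvise"):
            try:
                os.posix_fadvise(fd, 0, size, os.POSIX_FADV_SEQUENTIAL)
            except OSError:
                pass  # the advice is only a hint
        if hasattr(os, "posix_fallocate"):
            # Raises OSError (for example ENOSPC) if space cannot be reserved.
            os.posix_fallocate(fd, 0, size)
        else:
            # Fallback for this sketch: sets the size but reserves nothing.
            os.ftruncate(fd, size)
        return os.fstat(fd).st_size
    finally:
        os.close(fd)


def demo():
    """Preallocate one megabyte in a scratch directory."""
    with tempfile.TemporaryDirectory() as tmpdir:
        return preallocate(os.path.join(tmpdir, "data.bin"), 1 << 20)
```

Whether the blocks end up contiguous on disk is up to the implementation; the sketch can only observe that the file has reached the requested size.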
Finally, the pathconf() variables {POSIX_REC_MIN_XFER_SIZE},
{POSIX_REC_MAX_XFER_SIZE}, and {POSIX_REC_INCR_XFER_SIZE} tell the
application a range of transfer sizes that are recommended for best
I/O performance.
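One way an application might use these three values is sketched below. The function name pick_transfer_size is hypothetical, and the sketch assumes that the recommended sizes are the minimum plus a whole number of increments, not exceeding the maximum. The values themselves would be obtained from pathconf() and are passed in as plain integers here.

```python
def pick_transfer_size(wanted, min_size, max_size, incr):
    """Choose a recommended transfer size close to, but not above, wanted."""
    # Keep the request inside the recommended range first.
    clamped = max(min_size, min(wanted, max_size))
    # Then step down to the nearest size of the form min_size + k * incr.
    steps = (clamped - min_size) // incr
    return min_size + steps * incr
```

With a minimum of 4096, a maximum of 65536, and an increment of 4096, a wanted size of 10000 bytes would be reduced to 8192.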
Where bounded response time is required, the vendor can supply
the appropriate settings of the advisories to achieve a guaranteed
performance level.
The interfaces meet the goals while allowing applications using regular
files to take advantage of performance optimizations. The interfaces tell
the implementation expected application behavior which the implementation
can use to optimize performance on a particular system with a particular
dynamic load.
The posix_memalign() function was added to allow for the allocation of
specifically aligned buffers; for example, for {POSIX_REC_XFER_ALIGN}.
The working group also considered the alternative of adding a function
which would return an aligned pointer to memory within a user-supplied
buffer. This was not considered to be the best method, because it
potentially wastes large amounts of memory when buffers need to be
aligned on large alignment boundaries.
XCU ERN 24 cd relative paths Accept as marked below
This is an interpretation.
The standard states the requirements for the cd utility and its
handling of symbolic links, and conforming implementations must conform
to this. However, concerns have been raised about this which are being
referred to the sponsor.
Rationale:
A number of defects have been identified with how the cd utility
handles symbolic links.
Notes to the Editor (not part of the interpretation):
Replace step 6 with the following:
"6. If the -P option is in effect, set curpath to the directory
operand. Otherwise, set curpath to the string formed by the
concatenation of the value of PWD, a slash character, and the
operand."
Replace step 7 with the following:
"7. If the -P option is in effect, proceed to step 10.
If curpath does not begin with a slash character, set curpath
to the string formed by the concatenation of the value of PWD,
a slash character, and the operand."
(Note that most of the old step 7 text reappears in the new step 10
below.)
Replace step 8b with the following:
"b. For each dot-dot component, if there is a preceding component
and it is neither root nor dot-dot, then:
i. If the preceding component does not refer (in the
context of pathname resolution with symbolic links followed)
to a directory, then the cd utility shall display an
appropriate error message and no further steps shall be
taken.
ii. The preceding component, all slashes separating the
preceding component from dot-dot, dot-dot and all slashes
separating dot-dot from the following component (if any)
shall be deleted."
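The revised step 8b can be modelled with a short Python sketch. This is not an implementation of cd: the function name collapse_dot_dot is hypothetical, curpath is assumed to be absolute, and dot components are skipped on the assumption that step 8a would already have removed them. The directory check is passed in as a function so that the behavior can be examined without touching the file system.

```python
import os


def collapse_dot_dot(curpath, is_dir=os.path.isdir):
    """Apply the dot-dot handling of step 8b to an absolute curpath."""
    kept = []
    for comp in curpath.split("/"):
        if comp == "" or comp == ".":
            # Empty strings come from leading or repeated slashes; dot
            # components are assumed to have been handled by step 8a.
            continue
        if comp == ".." and kept and kept[-1] != "..":
            preceding = "/" + "/".join(kept)
            # Step 8b(i): the preceding component must refer to a
            # directory (os.path.isdir follows symbolic links).
            if not is_dir(preceding):
                raise NotADirectoryError(preceding)
            # Step 8b(ii): delete the preceding component and the dot-dot.
            kept.pop()
        else:
            # Dot-dot at the root, or after another dot-dot, is kept.
            kept.append(comp)
    return "/" + "/".join(kept)
```

Under these assumptions, "/usr/lib/../bin" becomes "/usr/bin", while "/etc/passwd/../hosts" is rejected when "/etc/passwd" is not a directory.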
Insert a new step 9:
"9. If curpath is longer than {PATH_MAX} bytes (including the
terminating null) and the directory operand was not longer
than {PATH_MAX} bytes (including the terminating null), then
curpath shall be converted from an absolute pathname to an
equivalent relative pathname if possible. This conversion
shall always be considered possible if the value of PWD, with
a trailing slash added if it does not already have one, is an
initial substring of curpath. Whether or not it is
considered possible under other circumstances is unspecified.
Implementations may also apply this conversion if curpath is
not longer than {PATH_MAX} bytes or the directory operand was
longer than {PATH_MAX} bytes."
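The conversion described in the new step 9 might be sketched as follows. The function name maybe_make_relative is hypothetical, path_max stands in for {PATH_MAX}, and the sketch handles only the case the text says is always possible (PWD, with a trailing slash, is an initial substring of curpath). Returning "." when nothing remains after the prefix is an assumption of this example.

```python
import os


def maybe_make_relative(curpath, operand, pwd, path_max):
    """Convert curpath to a relative pathname when step 9 requires it."""

    def size(pathname):
        # Length in bytes, including the terminating null.
        return len(os.fsencode(pathname)) + 1

    # Conversion is required only if curpath is too long and the operand
    # itself was not.
    if size(curpath) <= path_max or size(operand) > path_max:
        return curpath
    prefix = pwd if pwd.endswith("/") else pwd + "/"
    if curpath.startswith(prefix):
        return curpath[len(prefix):] or "."
    # Other circumstances are unspecified; this sketch leaves curpath alone.
    return curpath
```

Step 10 still sets PWD from the value curpath had before this conversion, so a caller would need to keep the original string.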
Replace the old step 9 with the following:
"10. The cd utility shall then perform actions equivalent to the
chdir() function called with curpath as the path argument.
If these actions fail for any reason, the cd utility shall
display an appropriate error message and the remainder of
this step shall not be executed. If the -P option is not in
effect, the PWD environment variable shall be set to the
value that curpath had on entry to step 9 (i.e. before
conversion to a relative pathname). If the -P option is in
effect, the PWD environment variable shall be set to an
absolute pathname for the current working directory and shall
not contain filename components that, in the context of
pathname resolution, refer to a file of type symbolic link.
If there is insufficient permission on the new directory, or
on any parent of that directory, to determine the current
working directory, the value of the PWD environment variable
is unspecified."
XCU ERN 36 what constitutes a number for test and sh arithmetic expansion
Accept as marked below
This is an interpretation.
The standard does not speak to this issue of what constitutes a number in
XCU's test(1) and shell arithmetic expansion, and as such no conformance
distinction can be made between alternative implementations based on
this. This is being referred to the sponsor.
In the event that the operands to the binary primaries
(-gt, -ge, -lt, -le, -eq, -ne) are not integers,
implementations are free to provide extensions that would recognize
those values or to treat them as errors.
The standard is unclear whether the integer arguments to the
six binary primaries are only decimal or if octal or hexadecimal
are recognized. Historically only decimal values have been recognized.
Notes to the editor for a future revision (not part of this interpretation):
In XCU test OPERANDS section on p909
Change "integers/integer" on lines 35256-35262 to "decimal integers/integer"
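The decimal-only reading can be illustrated with a small Python model of the six binary primaries. The names parse_decimal and compare are hypothetical, and accepting an optional leading sign is an assumption of this sketch. Rejecting other strings with an error is one of the two behaviors the interpretation allows; an implementation could instead recognize them as an extension.

```python
import operator
import re

# Optional sign followed by ASCII decimal digits only (assumed format).
_DECIMAL = re.compile(r"[+-]?[0-9]+")

_PRIMARIES = {
    "-eq": operator.eq,
    "-ne": operator.ne,
    "-gt": operator.gt,
    "-ge": operator.ge,
    "-lt": operator.lt,
    "-le": operator.le,
}


def parse_decimal(text):
    """Return text as an integer, treating it strictly as decimal."""
    if not _DECIMAL.fullmatch(text):
        raise ValueError("not a decimal integer: %r" % text)
    # Base 10 is explicit, so a leading zero does not select octal.
    return int(text, 10)


def compare(n1, primary, n2):
    """Evaluate 'n1 primary n2' the way a decimal-only test might."""
    return _PRIMARIES[primary](parse_decimal(n1), parse_decimal(n2))
```

In this model "010" compares equal to "10", and "0x10" is reported as an error.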
XCU ERN 46 uucp removal from specification Accept as marked below
This is not a defect in the current standard, which reflects existing
known practise. It is agreed that this item should be placed into
SD/5 for consideration of whether to move it into an option
in the next revision. The review group noted that, based on feedback to date,
there still appears to be use and there are freely available implementations.
XCU ERN 47 c99 -l operand Accept as marked
This is an Interpretation
The standard is unclear on this issue, and no conformance
distinction can be made between alternative implementations based
on this. This is being referred to the sponsor.
Notes to the Editor for a future revision (not part of this interpretation):
On line 8342 change:
"An operand is either in the form of a pathname or the form
-l library."
to:
"An operand is either in the form of a pathname or the form
-llibrary, or is one of two consecutive operands of the
form -l for the first and library for the second."
On line 8354 change:
"-l library (The letter ell.) Search the library named:"
to:
"-llibrary (A <hyphen>, the letter ell and a library name.)
-l library (Two consecutive operands, the first being a
<hyphen> and the letter ell; the second being
a library name.)
Search the library named:"
Add a new para before 8356 p213:
For the remainder of this description of the c99 utility,
both of the forms -l library and -llibrary are referred to as
-l operand for brevity (even though the -l library form is
actually two operands).
After line 8359 add a new paragraph:
"If the last operand is a -l with no library name, then the c99
utility shall write a diagnostic message to standard error and
shall return a non-zero exit status."
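The two accepted forms, and the diagnostic case, can be modelled as below. This is a sketch of operand classification only, not of c99 itself: the function name parse_operands and the tuple labels are hypothetical, and the input is assumed to be the operand list that remains after option processing.

```python
def parse_operands(operands):
    """Classify c99 operands as pathnames or -l library references."""
    parsed = []
    items = iter(operands)
    for item in items:
        if item == "-l":
            # Two-operand form: the library name is the next operand.
            name = next(items, None)
            if name is None:
                # Last operand is a bare -l: a diagnostic case.
                raise ValueError("-l requires a library name")
            parsed.append(("library", name))
        elif item.startswith("-l"):
            # Single-operand form: -llibrary.
            parsed.append(("library", item[2:]))
        else:
            parsed.append(("pathname", item))
    return parsed
```

A real utility would write the diagnostic to standard error and exit with a non-zero status where this sketch raises ValueError.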
XSH ERN 62 dbm_open Accept as marked below.
This is an Interpretation:
The standard states the requirements for the dbm_*
functions and their database implementation, and conforming implementations
must conform to this. However, concerns have been raised about this
which are being referred to the sponsor.
Rationale:
The current standard describes a specific implementation for storage
of a database excluding common existing practise which has evolved
yet remained compatible at the application programming interface.
Notes to the Editor for a future revision (not part of this interpretation):
In the DESCRIPTION
Change from:
"A datum consists of at least two members, dptr and dsize. The dptr member
points to an object that is dsize bytes in length. Arbitrary binary
data, as well as character strings, may be stored in the object pointed
to by dptr.
The database is stored in two files. One file is a directory containing
a bitmap of keys and has .dir as its suffix. The second file contains
all data and has .pag as its suffix.
The dbm_open() function shall open a database. The file argument to
the function is the pathname of the database. The function opens two
files named file.dir and file.pag. The open_flags argument has the same
meaning as the flags argument of open() except that a database opened
for write-only access opens the files for read and write access and the
behavior of the O_APPEND flag is unspecified. The file_mode argument
has the same meaning as the third argument of open()."
To:
"A datum consists of at least two members, dptr and dsize. The dptr member
points to an object that is dsize bytes in length. Arbitrary binary
data, as well as character strings, may be stored in the object pointed
to by dptr.
A database shall be stored in one or two files. When one file is used,
the name of the database file shall be formed by appending the suffix
".db" to the file argument given to dbm_open(). When two files are used,
the names of the database files shall be formed by appending the suffixes
".dir" and ".pag" respectively to the file argument.
The dbm_open() function shall open a database. The file argument to the
function is the pathname of the database. The open_flags argument has
the same meaning as the flags argument of open() except that a database
opened for write-only access opens the files for read and write access
and the behavior of the O_APPEND flag is unspecified. The file_mode
argument has the same meaning as the third argument of open().
The dbm_open() function need not accept pathnames longer than
{PATH_MAX}-4 bytes (including the terminating null), or pathnames
with a last component longer than {NAME_MAX}-4 bytes (excluding the
terminating null)."
Add to APPLICATION USAGE
Applications should take care that database pathname
arguments specified to dbm_open() are not prefixes of
unrelated files. This might be done, for example, by placing
databases in a separate directory.
Since some implementations use three characters for a suffix
and others use four characters for a suffix,
applications should ensure that the maximum portable pathname length
passed to dbm_open() is no greater than {PATH_MAX}-4 bytes, with the
last component of the pathname no greater than {NAME_MAX}-4 bytes.
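An application could check the portable length limits before calling dbm_open(), along the lines of the Python sketch below. The function names dbm_name_is_portable and candidate_files are hypothetical, and path_max and name_max stand in for {PATH_MAX} and {NAME_MAX}, which a real program would obtain from the system.

```python
import os


def dbm_name_is_portable(path, path_max, name_max):
    """Check a database pathname against the {PATH_MAX}-4/{NAME_MAX}-4 limits."""
    encoded = os.fsencode(path)
    last = os.path.basename(encoded)
    # The pathname limit counts the terminating null; the component limit
    # does not.
    fits_path = len(encoded) + 1 <= path_max - 4
    fits_name = len(last) <= name_max - 4
    return fits_path and fits_name


def candidate_files(path):
    """File names an implementation may create for the database."""
    return [path + ".db", path + ".dir", path + ".pag"]
```

Listing the candidate names also makes it easy to check that the chosen pathname is not a prefix of an unrelated file in the same directory.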
Add to RATIONALE:
Previously the standard required the database to be stored in two files,
one file being a directory containing a bitmap of keys and having ".dir"
as its suffix, and the second file containing all data and having ".pag" as
its suffix. This has been changed not to specify the use of the files
and to allow newer implementations of the Berkeley DB interface using
a single file that have evolved while remaining compatible with the
application programming interface. The standard developers considered
removing the specific suffixes altogether but decided to retain them
so as not to pollute the application file namespace more than necessary
and to allow for portable backups of the database.
Next Steps
-----------
Andrew will update the aardvark reports with the latest inbound
defect reports.
There are a number of open action items outstanding:
1. Don Cragun Pathname Resolution proposal
2. Larry Dwyer system() and threads
3. Joerg Schilling wording for XCU ERN 1 pax
The next teleconference call is scheduled for November 11, 2004.