Timeout for TCP/IP Offload
Dan Yee, IBM TPF Development
Over the past few years, several TPF customers have requested
an option to time out send-type function calls that have been
issued to the Transmission Control Protocol/Internet Protocol
(TCP/IP) offload device. With APAR PJ28568, that request has become
a reality in the TPF base. The code for this APAR is now available
on the TPF Web site at
http://www.ibm.com/software/htp/tpf/pages/maint.htm.
What Are Send-Type Functions?
Many of our customers run socket applications on their TPF
systems in which two or more ECBs share a socket. With APAR
PJ26346 on program update tape (PUT) 12, socket functions of the
same type are serialized through enqc() and deqc() function calls
issued by the TCP/IP offload code. The send-type functions, which
are send(), sendto(), write(), and writev(), form one such group
of same-type functions that TPF serializes.
The send(), write(), and writev()
functions are issued for connected sockets, such as TCP sockets,
and sendto() functions are issued
for sockets that are not connected, such as User Datagram Protocol
(UDP) sockets. In the examples for this article, I use the send() function primarily because it
is the send-type function that is most commonly used; but the
timeout feature I am going to describe applies to the other send-type
functions as well.
A common characteristic of send-type functions is that they
are each considered blocked socket functions. A blocked
socket function is a function that does not always get back control
immediately after being issued. Other examples of blocked socket
functions are accept(), connect(), read(),
recv(), and recvfrom().
One reason why blocked functions do not always get back control
immediately is that they might have to wait for the remote system
to issue a corresponding connect(),
accept(), send(),
sendto(), or write()
function. Send-type functions can be blocked for a socket if the
remote system does not issue a corresponding activate_on_receipt(),
read(), recv(), or recvfrom() function, or if there is no buffer
space available to transmit the message being sent.
Socket application programmers have the option of changing
a socket to nonblocking mode so that all blocked functions get
back control immediately. However, more often than not, the function
returns a primary return code of -1 and a secondary return code of
SOCWOULDBLOCK (35), which means the function must be reissued until
it succeeds. With TPF TCP/IP offload support, excessive
I/Os to and from the offload device can result when a socket is
in nonblocking mode, so most socket application programmers writing
programs for the offload device choose to keep the socket in blocking
mode to avoid the extra I/Os. For the examples in this article,
I am assuming that the application's socket is in blocking mode
because the new timeout option has no value if the socket is in
nonblocking mode.
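As a point of comparison, the reissue behavior of a nonblocking socket can be observed on an ordinary POSIX system. The following is a minimal sketch, not TPF code: it assumes POSIX errno with EWOULDBLOCK or EAGAIN in place of sock_errno() and SOCWOULDBLOCK, and the helper names make_nonblocking and sends_until_would_block are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: switch a descriptor to nonblocking mode.
   Returns 0 on success, -1 on error. */
int make_nonblocking(int fd)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: send fixed-size chunks on a nonblocking socket
   until the send buffer is full. Returns the number of send() calls that
   succeeded before one failed with EWOULDBLOCK/EAGAIN, or -1 on any
   other error. */
long sends_until_would_block(int fd)
{
    char chunk[1024];
    long calls = 0;

    memset(chunk, 'x', sizeof(chunk));
    for (;;) {
        ssize_t n = send(fd, chunk, sizeof(chunk), 0);
        if (n >= 0) {
            calls++;
            continue;
        }
        if (errno == EWOULDBLOCK || errno == EAGAIN)
            return calls;   /* the caller would have to reissue later */
        return -1;
    }
}
```

On a connected socket pair whose receiving end never reads, the helper returns once the buffer fills. Every reissued call after that point is the kind of extra request that, on TPF, would turn into additional I/O to the offload device.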
Explaining the New Timeout Option
With TPF TCP/IP offload support, the send buffer is located
in the TCP/IP offload device, so a send()
issued to the device will be blocked if there is no buffer space
available in the device. TPF sets the timeout for send-type functions
to 0 because the offload device does not support the SO_SNDTIMEO
option for the setsockopt() function.
If TPF is allowed to time out the function when there is no available
buffer space in the offload device, the application could then
issue another send() request before
the offload device has transmitted the previous send()
request. As a result, there could be two or more outstanding send() function calls for the same
socket in the TCP/IP offload device, which is not supported.
In most circumstances, buffer space for a send()
request will be available, the offload device will transmit the
data, and TPF will return to the caller within a fraction of a second.
However, there are instances in which no buffer space is available
to transmit the data, so the data cannot be sent immediately.
If the receive buffer of the remote system is full, the send() request from TPF is blocked
in the offload device. When this happens, the ECB that issued
the send() function is suspended
until the send() response from
the offload device is received. If the socket application consists
of multiple ECBs issuing a send()
for that same socket, those ECBs will also be suspended through
the enqc() function call, which
is issued by the TPF offload code. For send-type functions, TPF
sets the timeout for the enqc()
function call to 0, so those ECBs will be suspended indefinitely
until the send() response from
the offload device for the original send()
is received by TPF.
While the ECB that issued the original send()
function call waits for the response, the socket application could
create additional ECBs that issue a send()
and become blocked by the enqc()
function call issued by the TPF offload code. As more and more
ECBs are created, resources in the system become depleted and
the system can get into input list shutdown. In response to that
possibility, we have created a timeout option for send-type function
calls that are issued to the offload device to enable these calls
to time out within a time period specified by the socket application.
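To get a feel for how quickly ECBs can accumulate, consider a deliberately simplified model. It rests on illustration-only assumptions, not measurements: the application creates sending ECBs for the socket at a constant rate, the original send() stays blocked for a fixed number of seconds, and a timeout releases every waiting ECB at once. The function name suspended_ecbs is hypothetical.

```c
#include <assert.h>

/* Simplified model (an assumption, not TPF code): number of ECBs
   suspended behind one blocked send() at the moment the waiters are
   released.
     ecbs_per_sec : rate at which new sending ECBs are created
     blocked_secs : how long the send() stays blocked in the device
     timeout_secs : send timeout; 0 means no timeout (wait indefinitely) */
long suspended_ecbs(long ecbs_per_sec, long blocked_secs, long timeout_secs)
{
    long wait_secs = blocked_secs;

    assert(ecbs_per_sec >= 0 && blocked_secs >= 0 && timeout_secs >= 0);

    /* A nonzero timeout caps how long ECBs can keep piling up. */
    if (timeout_secs != 0 && timeout_secs < blocked_secs)
        wait_secs = timeout_secs;

    return ecbs_per_sec * wait_secs;
}
```

With 50 ECBs per second and a send() blocked for 120 seconds, the model gives 6000 suspended ECBs without a timeout and 100 with a 2-second timeout.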
New Timeout Option Details
TPF TCP/IP native stack support currently supports the SO_SNDTIMEO
option for setsockopt(). With
APAR PJ28568, the TPF offload code now supports that option as
well. The offload implementation differs slightly from the native
stack implementation, most visibly in the return code and in what
happens to the socket after a timeout, so take those differences
into account in your applications.
Following is an example of the code you would need to issue
to time out a send() socket call
within 2 seconds:
#include <stdio.h>
#include <socket.h>
int client_sock, rc;
int sndtimeo = 2;    /* send timeout, in seconds */
.
.
.
rc = setsockopt(client_sock, SOL_SOCKET, SO_SNDTIMEO,
                (char *)&sndtimeo, sizeof(sndtimeo));
if (rc == -1)
{
    printf("ABCD: Error in setsockopt - %d\n", sock_errno());
}
.
.
.
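For readers who want to watch a send timeout fire outside TPF, the same idea can be tried on a POSIX system. This is a sketch under assumptions, not the TPF offload interface: POSIX passes the timeout as a struct timeval rather than an int, reports errors through errno, and typically fails a timed-out send() with EAGAIN or EWOULDBLOCK without closing the socket. The helper names set_send_timeout_ms and blocked_send_times_out are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: set a send timeout in milliseconds (POSIX style).
   Returns the setsockopt() result: 0 on success, -1 on error. */
int set_send_timeout_ms(int fd, long ms)
{
    struct timeval tv;

    assert(ms >= 0);
    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));
}

/* Hypothetical helper: fill the send buffer without blocking, then issue
   one blocking send() that can only end by timing out because the peer
   never reads. Returns 1 if that send() timed out, 0 if it unexpectedly
   succeeded, and -1 on any other error. */
int blocked_send_times_out(int fd)
{
    char chunk[1024];
    ssize_t n;

    memset(chunk, 'x', sizeof(chunk));

    /* MSG_DONTWAIT makes only these filling calls nonblocking. */
    while ((n = send(fd, chunk, sizeof(chunk), MSG_DONTWAIT)) >= 0)
        ;
    if (errno != EAGAIN && errno != EWOULDBLOCK)
        return -1;

    /* The buffer is full, so this call blocks until SO_SNDTIMEO expires. */
    n = send(fd, chunk, sizeof(chunk), 0);
    if (n >= 0)
        return 0;
    return (errno == EAGAIN || errno == EWOULDBLOCK) ? 1 : -1;
}
```

A caller would create a connected socket pair, call set_send_timeout_ms() on the sending end, and then call blocked_send_times_out() on it. Unlike the offload behavior described in this article, the POSIX socket stays open after the timeout, so the caller still has to close it.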
If a timeout occurs on a send()
socket call, the TPF offload code issues a deqc()
function to post other ECBs that may be waiting for the previous
send() to be completed for this
socket. At the same time, the posted ECB is notified that the
previous send() has timed out
so that it does not attempt to issue another send()
to the offload device. The ECB that has just been posted then
issues a deqc() function as well
to post any other ECB waiting to issue a send()
for that socket. The process continues until each ECB that had
been waiting for the original send()
to be completed is posted by a deqc().
For each ECB that is posted, TPF returns a primary return code
of -1 and secondary return code of SOCNOTSOCK (38).
In the meantime, TPF closes the socket internally on behalf
of the original ECB that issued the send().
Closing the socket internally enables other suspended ECBs waiting
for a response from the offload device to be posted so that the
application can exit those ECBs and free up additional resources.
It also prevents ECBs from issuing additional blocked function
calls to the offload device and potentially tying up resources
when they become blocked in the device. TPF returns to the original
ECB that issued the send() with
a primary return code of -1 and secondary return code of OFFLOADTIMEOUT
(1000). Because TPF closes the socket internally, it is not necessary
for the socket application to issue a close()
for that socket. By contrast, for code written for TPF native stack
support, your socket application gets a SOCTIMEDOUT (60) secondary
return code on the send() and must decide for itself whether to
close the socket.
If your program issues a send(), it should check the return code
to determine whether the data was sent. Following is an example
of code that you could write to handle a timeout from the offload
device:
#include <stdio.h>
#include <stdlib.h>
#include <socket.h>
int rc, client_sock, message_size;
char *send_client_message;
.
.
.
rc = send(client_sock, send_client_message, message_size, 0);
if (rc == -1 && sock_errno() == OFFLOADTIMEOUT)
{
    /* TPF has already closed the socket internally; no close() needed. */
    printf("ABCD: Time-out on send - %d\n", sock_errno());
    exit(0);
}
else if (rc == -1)
{
    /* Any other send error: close the socket before exiting. */
    printf("ABCD: send error - %d\n", sock_errno());
    close(client_sock);
    exit(0);
}
.
.
.
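The return codes described in this section can also be folded into a single decision helper, so that each ECB can tell whether it still has an open socket to clean up. The sketch below is illustrative only: it defines the secondary return codes locally, using the values quoted in this article, instead of including the TPF header, and the enum and function names are hypothetical. Treating SOCNOTSOCK as "socket already gone" is an assumption based on the timeout sequence described above.

```c
#include <assert.h>

/* Secondary return codes, using the values quoted in this article.
   Real TPF code would take these from the TPF socket header instead. */
enum {
    EX_SOCNOTSOCK     = 38,   /* waiting ECB posted after a timeout  */
    EX_SOCTIMEDOUT    = 60,   /* native stack send timeout           */
    EX_OFFLOADTIMEOUT = 1000  /* offload send timeout, socket closed */
};

/* What the caller should do next (hypothetical names). */
typedef enum {
    SEND_OK,             /* data accepted; carry on                 */
    SEND_SOCKET_GONE,    /* socket closed by TPF; skip close()      */
    SEND_APP_DECIDES,    /* native timeout; application chooses     */
    SEND_CLOSE_AND_EXIT  /* any other error; close the socket       */
} send_action;

/* Map a send() result (primary and secondary return codes) to an action. */
send_action classify_send(int rc, int secondary)
{
    assert(rc >= -1);

    if (rc != -1)
        return SEND_OK;

    switch (secondary) {
    case EX_OFFLOADTIMEOUT:   /* this ECB issued the send that timed out */
    case EX_SOCNOTSOCK:       /* this ECB was waiting behind that send   */
        return SEND_SOCKET_GONE;
    case EX_SOCTIMEDOUT:
        return SEND_APP_DECIDES;
    default:
        return SEND_CLOSE_AND_EXIT;
    }
}
```

In TPF code, the call would look like classify_send(rc, sock_errno()) immediately after the send(), with the TPF-defined constants substituted for the local enum.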
If the SO_SNDTIMEO option for setsockopt()
is not specified for a socket or if the timeout value is set to
0, the send(), sendto(),
write(), and writev()
functions block indefinitely until there is space available in
the send buffer to transmit the data.
For more information about the send timeout option for TCP/IP
offload support, see TPF Transmission Control Protocol/Internet
Protocol in the TPF 4.1 Product
Information Center.