The journey of a packet through the linux 2.4 network stack
Harald Welte laforge@gnumonks.org
2000/09/13 14:18:22
This document describes the journey of a network packet inside the linux kernel 2.4.x. This has changed drastically since 2.2 because the globally serialized bottom half was abandoned in favor of the new softirq system.
I apologize for my ignorance, but this document has a strong focus on the "default case": x86 architecture and IP packets which get forwarded.
I am definitely no kernel guru, and the information provided by this document may be wrong. So don't expect too much; I'll always appreciate your comments and bugfixes.
2.1 The receive interrupt
If the network card receives an Ethernet frame which matches the local MAC address or is a link-layer broadcast, it issues an interrupt. The network driver for this particular card handles the interrupt and fetches the packet data via DMA / PIO / whatever into RAM. It then allocates an skb and calls a function of the protocol-independent device support routines: net/core/dev.c:netif_rx(skb).
If the driver didn't already timestamp the skb, it is timestamped now. Afterwards the skb gets enqueued in the appropriate queue for the processor handling this packet. If the queue backlog is full, the packet is dropped at this point. After enqueuing the skb, the receive softirq is marked for execution via include/linux/interrupt.h:__cpu_raise_softirq().
The interrupt handler exits and all interrupts are reenabled.
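The timestamp, enqueue-or-drop and "mark the softirq" steps can be pictured with a small userspace model. This is only a sketch, not the kernel's code: `struct pkt`, `struct rx_queue`, `model_netif_rx()` and the `MAX_BACKLOG` value are names and numbers assumed for illustration, and a plain flag stands in for the softirq pending bit.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_BACKLOG 4        /* hypothetical limit, chosen only for this sketch */

struct pkt {                 /* stand-in for the skb */
    long stamp;              /* 0 means "not timestamped yet" */
    struct pkt *next;
};

struct rx_queue {            /* stand-in for one CPU's receive queue */
    struct pkt *head, *tail;
    size_t len;
    bool softirq_pending;    /* models "receive softirq marked for execution" */
};

/* Returns true if the packet was queued, false if it was dropped. */
bool model_netif_rx(struct rx_queue *q, struct pkt *p, long now)
{
    if (p->stamp == 0)       /* the driver did not timestamp the packet */
        p->stamp = now;

    if (q->len >= MAX_BACKLOG)
        return false;        /* backlog full: drop here */

    p->next = NULL;          /* append to the tail of the queue */
    if (q->tail)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
    q->len++;

    q->softirq_pending = true;
    return true;
}
```

The point of the model is that the interrupt path does very little: it queues the packet and leaves all protocol processing to the softirq.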
2.2 The network RX softirq
Now we encounter one of the big changes between 2.2 and 2.4: the whole network stack is no longer a bottom half, but a softirq. Softirqs have the major advantage that they may run on more than one CPU simultaneously. Bottom halves (bh's) were guaranteed to run on only one CPU at a time.
Our network receive softirq is registered in net/core/dev.c:net_init() using the function kernel/softirq.c:open_softirq() provided by the softirq subsystem.
Further handling of our packet is done in the network receive softirq (NET_RX_SOFTIRQ), which is called from kernel/softirq.c:do_softirq(). do_softirq() itself is called from three places within the kernel:
- from arch/i386/kernel/irq.c:do_IRQ(), which is the generic IRQ handler
- from arch/i386/kernel/entry.S, in case the kernel just returned from a syscall
- inside the main process scheduler in kernel/sched.c:schedule()
So if execution passes one of these points, do_softirq() is called. It detects that NET_RX_SOFTIRQ is marked and calls net/core/dev.c:net_rx_action(). Here the skb is dequeued from this CPU's receive queue and afterwards handed to the appropriate packet handler. In the case of IPv4 this is the IPv4 packet handler.
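The register, mark and run cycle of a softirq can be sketched as a table of handlers plus a bitmask of pending entries. This is a simplified single-CPU model under stated assumptions: all `model_` names and the `MODEL_` constants are made up for illustration, and the real kernel keeps this state per CPU.

```c
#include <stddef.h>

enum { MODEL_NET_TX = 0, MODEL_NET_RX = 1, MODEL_NR_SOFTIRQS = 2 };

typedef void (*softirq_fn)(void *data);

struct softirq_state {
    softirq_fn action[MODEL_NR_SOFTIRQS];  /* registered handlers */
    void *data[MODEL_NR_SOFTIRQS];
    unsigned int pending;                  /* one bit per softirq number */
};

/* Register a handler for softirq number nr. */
void model_open_softirq(struct softirq_state *s, int nr,
                        softirq_fn fn, void *data)
{
    s->action[nr] = fn;
    s->data[nr] = data;
}

/* Mark softirq nr for execution; nothing runs yet. */
void model_raise_softirq(struct softirq_state *s, int nr)
{
    s->pending |= 1u << nr;
}

/* Run every registered handler whose bit is set; returns how many ran. */
int model_do_softirq(struct softirq_state *s)
{
    unsigned int pending = s->pending;
    int ran = 0;

    s->pending = 0;                        /* clear before running handlers */
    for (int nr = 0; nr < MODEL_NR_SOFTIRQS; nr++) {
        if ((pending & (1u << nr)) && s->action[nr]) {
            s->action[nr](s->data[nr]);
            ran++;
        }
    }
    return ran;
}

/* Example handler: counts its invocations through the data pointer. */
void model_count_action(void *data)
{
    (*(int *)data)++;
}
```

Calling `model_do_softirq()` when nothing is pending is harmless, which is why it can be invoked from several unrelated places.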
2.3 The IPv4 packet handler
The IP packet handler is registered via net/core/dev.c:dev_add_pack(), called from net/ipv4/ip_output.c:ip_init().
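Registering a packet handler amounts to adding an entry to a table keyed by the link-layer protocol type, which the receive path later searches. A minimal sketch, assuming a fixed-size table and hypothetical names (`model_add_pack()`, `model_deliver()`, `model_ip_rcv()`); the only real-world constants used are the EtherType values 0x0800 (IPv4) and 0x86DD (IPv6).

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_PTYPES 8   /* arbitrary table size for this sketch */

typedef int (*pkt_handler)(const uint8_t *data, size_t len);

struct ptype_entry {
    uint16_t type;           /* protocol type, e.g. 0x0800 for IPv4 */
    pkt_handler func;
};

struct ptype_table {
    struct ptype_entry e[MODEL_MAX_PTYPES];
    size_t n;
};

/* Register a handler; returns 0 on success, -1 if the table is full. */
int model_add_pack(struct ptype_table *t, uint16_t type, pkt_handler f)
{
    if (t->n >= MODEL_MAX_PTYPES)
        return -1;
    t->e[t->n].type = type;
    t->e[t->n].func = f;
    t->n++;
    return 0;
}

/* Hand a packet to the first matching handler; -1 if none is registered. */
int model_deliver(const struct ptype_table *t, uint16_t type,
                  const uint8_t *data, size_t len)
{
    for (size_t i = 0; i < t->n; i++)
        if (t->e[i].type == type)
            return t->e[i].func(data, len);
    return -1;
}

/* Placeholder IPv4 handler: just reports the packet length. */
int model_ip_rcv(const uint8_t *data, size_t len)
{
    (void)data;
    return (int)len;
}
```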
The IPv4 packet handling function is net/ipv4/ip_input.c:ip_rcv(). After some initial checks (if the packet is for this host, ...) the IP checksum is calculated. Additional checks are done on the length and IP protocol version 4.
Every packet failing one of the sanity checks is dropped at this point.
If the packet passes the tests, we determine the size of the ip packet and trim the skb in case the transport medium has appended some padding.
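The sanity checks and the trimming step can be sketched on a raw byte buffer. This is not the code of ip_rcv(), and the order of checks is chosen for readability; `model_ip_checksum()` and `model_ip_rcv_check()` are hypothetical names. The checksum is the standard one's-complement sum over the IPv4 header, which yields 0 for an intact header.

```c
#include <stddef.h>
#include <stdint.h>

/* One's-complement checksum over len bytes (len is assumed to be even).
 * Computed over a header that includes a correct checksum, it returns 0. */
uint16_t model_ip_checksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((hdr[i] << 8) | hdr[i + 1]);
    while (sum >> 16)                      /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Returns the length the buffer should be trimmed to (the IP total
 * length), or -1 if the packet fails a sanity check and must be dropped. */
int model_ip_rcv_check(const uint8_t *pkt, size_t len)
{
    if (len < 20)                          /* shorter than a minimal header */
        return -1;

    unsigned version = pkt[0] >> 4;
    size_t ihl = (size_t)(pkt[0] & 0x0f) * 4;   /* header length in bytes */

    if (version != 4 || ihl < 20 || len < ihl)
        return -1;
    if (model_ip_checksum(pkt, ihl) != 0)
        return -1;

    size_t tot_len = ((size_t)pkt[2] << 8) | pkt[3];
    if (tot_len < ihl || tot_len > len)
        return -1;

    return (int)tot_len;                   /* anything beyond this is padding */
}
```

A frame that is longer than the IP total length is accepted and the excess is treated as link-layer padding; a frame that is shorter is dropped.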
Now, for the first time, one of the netfilter hooks is called.
Netfilter provides a generic and abstract interface to the standard routing code. This is currently used for packet filtering, mangling, NAT and queuing packets to userspace. For further reference see my conference paper 'The netfilter subsystem in Linux 2.4' or one of Rusty's unreliable guides, e.g. the netfilter-hacking-guide.
After successful traversal of the netfilter hook, net/ipv4/ip_input.c:ip_rcv_finish() is called.
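The pattern "traverse the hook, then call the continuation function on success" can be modelled as a chain of callbacks that each return a verdict. This is a reduced sketch with only two verdicts and hypothetical `model_` names; real netfilter has further verdicts and per-hook priorities that are left out here.

```c
#include <stddef.h>

enum model_verdict { MODEL_DROP = 0, MODEL_ACCEPT = 1 };

struct model_pkt {
    int mark;        /* incremented by each hook that has seen the packet */
    int delivered;   /* set by the continuation function */
};

typedef enum model_verdict (*model_hookfn)(struct model_pkt *);
typedef int (*model_okfn)(struct model_pkt *);

/* Run all hooks in order. If every hook accepts, call okfn and return its
 * result; if any hook drops the packet, stop and return -1. */
int model_nf_hook(model_hookfn *hooks, size_t n,
                  struct model_pkt *p, model_okfn okfn)
{
    for (size_t i = 0; i < n; i++)
        if (hooks[i](p) != MODEL_ACCEPT)
            return -1;
    return okfn(p);
}

/* Example hooks and continuation function. */
enum model_verdict model_hook_accept(struct model_pkt *p)
{
    p->mark++;
    return MODEL_ACCEPT;
}

enum model_verdict model_hook_drop(struct model_pkt *p)
{
    (void)p;
    return MODEL_DROP;
}

int model_rcv_finish(struct model_pkt *p)
{
    p->delivered = 1;
    return 0;
}
```

Passing the continuation as a parameter is what lets the same traversal logic serve every hook point along the packet's path.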
Inside ip_rcv_finish(), the packet's destination is determined by calling the routing function net/ipv4/route.c:ip_route_input(). Furthermore, if our IP packet has IP options, they are processed now. Depending on the routing decision made by net/ipv4/route.c:ip_route_input_slow(), the journey of our packet continues in one of the following functions:
net/ipv4/ip_input.c:ip_local_deliver()
The packet's destination is local; we have to process the layer 4 protocol and pass it to a userspace process.
net/ipv4/ip_forward.c:ip_forward()
The packet's destination is not local; we have to forward it to another network.
net/ipv4/route.c:ip_error()
An error occurred; we are unable to find an appropriate routing table entry for this packet.
net/ipv4/ipmr.c:ip_mr_input()
It is a multicast packet and we have to do some multicast routing.
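One way to picture this four-way branch is a route lookup that returns a function pointer, which the caller then invokes. The sketch below is a toy under stated assumptions: the "routing table" is a single local address plus one forwardable network, and every `model_` name is hypothetical. It does not reflect how route.c actually performs the lookup.

```c
#include <stdint.h>

struct model_skb { uint32_t daddr; };      /* destination address, host order */

typedef int (*model_input_fn)(struct model_skb *);

/* Stand-ins for the four continuation functions; each returns a tag. */
int model_local_deliver(struct model_skb *s) { (void)s; return 1; }
int model_forward(struct model_skb *s)       { (void)s; return 2; }
int model_error(struct model_skb *s)         { (void)s; return 3; }
int model_mr_input(struct model_skb *s)      { (void)s; return 4; }

struct model_rt {                          /* toy routing state */
    uint32_t local_addr;                   /* our own address */
    uint32_t net, mask;                    /* one network we can forward to */
};

/* The routing decision: pick the function that continues the journey. */
model_input_fn model_route_input(const struct model_rt *rt, uint32_t daddr)
{
    if (daddr == rt->local_addr)
        return model_local_deliver;
    if ((daddr >> 28) == 0xE)              /* 224.0.0.0/4 is multicast */
        return model_mr_input;
    if ((daddr & rt->mask) == rt->net)
        return model_forward;
    return model_error;                    /* no matching route */
}

/* The caller only invokes whatever the lookup selected. */
int model_ip_rcv_finish(const struct model_rt *rt, struct model_skb *s)
{
    return model_route_input(rt, s->daddr)(s);
}
```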
If the routing decided that this packet has to be forwarded to another device, the function net/ipv4/ip_forward.c:ip_forward() is called.
The first task of this function is to check the ip header's TTL. If it is <= 1 we drop the packet and return an ICMP time exceeded message to the sender.
We check whether the skb has enough headroom for the destination device's link layer header and expand the skb if necessary.
Next the TTL is decremented by one.
If our new packet is bigger than the MTU of the destination device and the don't fragment bit in the IP header is set, we drop the packet and send an ICMP frag needed message to the sender.
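The TTL and MTU decisions above can be condensed into a few lines. This is a sketch on a reduced packet description, not the code of ip_forward(): `struct model_ip_pkt`, `model_ip_forward()` and the result names are assumptions, and the headroom check and the sending of ICMP messages are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum model_fwd_result {
    MODEL_FWD_OK,              /* continue to the forward hook */
    MODEL_FWD_TIME_EXCEEDED,   /* drop, ICMP time exceeded */
    MODEL_FWD_FRAG_NEEDED      /* drop, ICMP frag needed */
};

struct model_ip_pkt {
    uint8_t ttl;
    bool dont_fragment;        /* DF bit of the IP header */
    size_t len;                /* total packet length in bytes */
};

enum model_fwd_result model_ip_forward(struct model_ip_pkt *p, size_t mtu)
{
    if (p->ttl <= 1)           /* would expire on this hop */
        return MODEL_FWD_TIME_EXCEEDED;

    p->ttl--;

    if (p->len > mtu && p->dont_fragment)
        return MODEL_FWD_FRAG_NEEDED;

    return MODEL_FWD_OK;       /* an oversized packet without DF is
                                  fragmented later on the output path */
}
```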
Finally it is time to call another one of the netfilter hooks - this time it is the NF_IP_FORWARD hook.
Assuming that the netfilter hook returns an NF_ACCEPT verdict, the function net/ipv4/ip_forward.c:ip_forward_finish() is the next step in our packet's journey.
ip_forward_finish() itself checks if we need to set any additional options in the IP header, and has ip_optFIXME doing this. Afterwards it calls include/net/ip.h:ip_send().
If we need some fragmentation, FIXME:ip_fragment gets called, otherwise we continue in net/ipv4/ip_forward:ip_finish_output().
ip_finish_output() again does nothing more than call the netfilter postrouting hook NF_IP_POST_ROUTING and, on successful traversal of this hook, call ip_finish_output2().
ip_finish_output2() prepends the hardware (link layer) header to our skb and calls net/ipv4/ip_output.c:ip_output().
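Prepending a header without copying the packet works because the buffer keeps free space in front of the data. A minimal model of that idea, assuming a fixed 64-byte buffer and hypothetical names (`struct model_buf`, `model_push()`); the real skb carries more bookkeeping than shown here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct model_buf {
    uint8_t storage[64];
    size_t data_off;     /* offset of the first packet byte; equals the
                            free headroom in front of the packet */
    size_t len;          /* current packet length */
};

/* Prepend hlen bytes in front of the packet. Returns the new start of the
 * packet, or NULL if there is not enough headroom. */
uint8_t *model_push(struct model_buf *b, const uint8_t *hdr, size_t hlen)
{
    if (hlen > b->data_off)
        return NULL;
    b->data_off -= hlen;               /* move the start backwards */
    b->len += hlen;
    memcpy(b->storage + b->data_off, hdr, hlen);
    return b->storage + b->data_off;
}
```

With 14 bytes of headroom, a 14-byte Ethernet header fits exactly; a further push fails, which is the situation in which the forwarding code would have to expand the buffer beforehand.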