| <?xml version="1.0" encoding="UTF-8"?> |
| <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" |
| "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" [ |
| <!ENTITY procfsexample SYSTEM "procfs_example.xml"> |
| ]> |
| |
| <book id="LKProcfsGuide"> |
| <bookinfo> |
| <title>Linux Kernel Procfs Guide</title> |
| |
| <authorgroup> |
| <author> |
| <firstname>Erik</firstname> |
| <othername>(J.A.K.)</othername> |
| <surname>Mouw</surname> |
| <affiliation> |
| <orgname>Delft University of Technology</orgname> |
| <orgdiv>Faculty of Information Technology and Systems</orgdiv> |
| <address> |
| <email>J.A.K.Mouw@its.tudelft.nl</email> |
| <pob>PO BOX 5031</pob> |
| <postcode>2600 GA</postcode> |
| <city>Delft</city> |
| <country>The Netherlands</country> |
| </address> |
| </affiliation> |
| </author> |
| </authorgroup> |
| |
| <revhistory> |
| <revision> |
| <revnumber>1.0 </revnumber> |
| <date>May 30, 2001</date> |
| <revremark>Initial revision posted to linux-kernel</revremark> |
| </revision> |
| <revision> |
| <revnumber>1.1 </revnumber> |
| <date>June 3, 2001</date> |
| <revremark>Revised after comments from linux-kernel</revremark> |
| </revision> |
| </revhistory> |
| |
| <copyright> |
| <year>2001</year> |
| <holder>Erik Mouw</holder> |
| </copyright> |
| |
| |
| <legalnotice> |
| <para> |
| This documentation is free software; you can redistribute it |
| and/or modify it under the terms of the GNU General Public |
| License as published by the Free Software Foundation; either |
| version 2 of the License, or (at your option) any later |
| version. |
| </para> |
| |
| <para> |
| This documentation is distributed in the hope that it will be |
| useful, but WITHOUT ANY WARRANTY; without even the implied |
| warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR |
| PURPOSE. See the GNU General Public License for more details. |
| </para> |
| |
| <para> |
| You should have received a copy of the GNU General Public |
| License along with this program; if not, write to the Free |
| Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, |
| MA 02111-1307 USA |
| </para> |
| |
| <para> |
| For more details see the file COPYING in the source |
| distribution of Linux. |
| </para> |
| </legalnotice> |
| </bookinfo> |
| |
| |
| |
| |
| <toc> |
| </toc> |
| |
| |
| |
| |
| <preface id="Preface"> |
| <title>Preface</title> |
| |
| <para> |
| This guide describes the use of the procfs file system from |
| within the Linux kernel. The idea to write this guide came up on |
| the #kernelnewbies IRC channel (see <ulink |
| url="http://www.kernelnewbies.org/">http://www.kernelnewbies.org/</ulink>), |
| when Jeff Garzik explained the use of procfs and forwarded me a |
| message Alexander Viro wrote to the linux-kernel mailing list. I |
| agreed to write it up nicely, so here it is. |
| </para> |
| |
| <para> |
| I'd like to thank Jeff Garzik |
| <email>jgarzik@pobox.com</email> and Alexander Viro |
| <email>viro@parcelfarce.linux.theplanet.co.uk</email> for their input, |
| Tim Waugh <email>twaugh@redhat.com</email> for his <ulink |
| url="http://people.redhat.com/twaugh/docbook/selfdocbook/">Selfdocbook</ulink>, |
| and Marc Joosen <email>marcj@historia.et.tudelft.nl</email> for |
| proofreading. |
| </para> |
| |
| <para> |
| This documentation was written while working on the LART |
| computing board (<ulink |
| url="http://www.lart.tudelft.nl/">http://www.lart.tudelft.nl/</ulink>), |
| which is sponsored by the Mobile Multi-media Communications |
| (<ulink |
| url="http://www.mmc.tudelft.nl/">http://www.mmc.tudelft.nl/</ulink>) |
| and Ubiquitous Communications (<ulink |
| url="http://www.ubicom.tudelft.nl/">http://www.ubicom.tudelft.nl/</ulink>) |
| projects. |
| </para> |
| |
| <para> |
| Erik |
| </para> |
| </preface> |
| |
| |
| |
| |
| <chapter id="intro"> |
| <title>Introduction</title> |
| |
| <para> |
| The <filename class="directory">/proc</filename> file system |
| (procfs) is a special file system in the Linux kernel. It's a |
| virtual file system: it is not associated with a block device |
| but exists only in memory. The files in procfs give userland |
| programs access to certain information from the kernel (like |
| process information in <filename |
| class="directory">/proc/[0-9]+/</filename>), and are also used |
| for debugging purposes (like <filename>/proc/ksyms</filename>). |
| </para> |
| |
| <para> |
| This guide describes the use of the procfs file system from |
| within the Linux kernel. It starts by introducing all relevant |
| functions to manage the files within the file system. After that |
| it shows how to communicate with userland and points out some |
| tips and tricks. Finally, it presents a complete example. |
| </para> |
| |
| <para> |
| Note that the files in <filename |
| class="directory">/proc/sys</filename> are sysctl files: they |
| don't belong to procfs and are governed by a completely |
| different API described in the Kernel API book. |
| </para> |
| </chapter> |
| |
| |
| |
| |
| <chapter id="managing"> |
| <title>Managing procfs entries</title> |
| |
| <para> |
| This chapter describes the functions that various kernel |
| components use to populate the procfs with files, symlinks, |
| device nodes, and directories. |
| </para> |
| |
| <para> |
| A minor note before we start: if you want to use any of the |
| procfs functions, be sure to include the correct header file! |
| This should be one of the first lines in your code: |
| </para> |
| |
| <programlisting> |
| #include &lt;linux/proc_fs.h&gt; |
| </programlisting> |
| |
| |
| |
| |
| <sect1 id="regularfile"> |
| <title>Creating a regular file</title> |
| |
| <funcsynopsis> |
| <funcprototype> |
| <funcdef>struct proc_dir_entry* <function>create_proc_entry</function></funcdef> |
| <paramdef>const char* <parameter>name</parameter></paramdef> |
| <paramdef>mode_t <parameter>mode</parameter></paramdef> |
| <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef> |
| </funcprototype> |
| </funcsynopsis> |
| |
| <para> |
| This function creates a regular file with the name |
| <parameter>name</parameter>, file mode |
| <parameter>mode</parameter> in the directory |
| <parameter>parent</parameter>. To create a file in the root of |
| the procfs, use <constant>NULL</constant> as |
| <parameter>parent</parameter> parameter. When successful, the |
| function will return a pointer to the freshly created |
| <structname>struct proc_dir_entry</structname>; otherwise it |
| will return <constant>NULL</constant>. <xref |
| linkend="userland"/> describes how to do something useful with |
| regular files. |
| </para> |
| |
| <para> |
| Note that passing a path that spans multiple directories is |
| explicitly supported. For example, |
| <function>create_proc_entry</function>(<parameter>"drivers/via0/info"</parameter>) |
| will create the <filename class="directory">via0</filename> |
| directory if necessary, with standard |
| <constant>0755</constant> permissions. |
| </para> |
| |
| <para> |
| If you only want to be able to read the file, the function |
| <function>create_proc_read_entry</function> described in <xref |
| linkend="convenience"/> may be used to create and initialise |
| the procfs entry in one single call. |
| </para> |
| </sect1> |
| |
| |
| |
| |
| <sect1 id="Creating_a_symlink"> |
| <title>Creating a symlink</title> |
| |
| <funcsynopsis> |
| <funcprototype> |
| <funcdef>struct proc_dir_entry* |
| <function>proc_symlink</function></funcdef> <paramdef>const |
| char* <parameter>name</parameter></paramdef> |
| <paramdef>struct proc_dir_entry* |
| <parameter>parent</parameter></paramdef> <paramdef>const |
| char* <parameter>dest</parameter></paramdef> |
| </funcprototype> |
| </funcsynopsis> |
| |
| <para> |
| This creates a symlink in the procfs directory |
| <parameter>parent</parameter> that points from |
| <parameter>name</parameter> to |
| <parameter>dest</parameter>. This translates in userland to |
| <literal>ln -s</literal> <parameter>dest</parameter> |
| <parameter>name</parameter>. |
| </para> |
| </sect1> |
| |
| <sect1 id="Creating_a_directory"> |
| <title>Creating a directory</title> |
| |
| <funcsynopsis> |
| <funcprototype> |
| <funcdef>struct proc_dir_entry* <function>proc_mkdir</function></funcdef> |
| <paramdef>const char* <parameter>name</parameter></paramdef> |
| <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef> |
| </funcprototype> |
| </funcsynopsis> |
| |
| <para> |
| Creates a directory <parameter>name</parameter> in the procfs |
| directory <parameter>parent</parameter>. |
| </para> |
| </sect1> |
| |
| |
| |
| |
| <sect1 id="Removing_an_entry"> |
| <title>Removing an entry</title> |
| |
| <funcsynopsis> |
| <funcprototype> |
| <funcdef>void <function>remove_proc_entry</function></funcdef> |
| <paramdef>const char* <parameter>name</parameter></paramdef> |
| <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef> |
| </funcprototype> |
| </funcsynopsis> |
| |
| <para> |
| Removes the entry <parameter>name</parameter> in the directory |
| <parameter>parent</parameter> from the procfs. Entries are |
| removed by their <emphasis>name</emphasis>, not by the |
| <structname>struct proc_dir_entry</structname> returned by the |
| various create functions. Note that this function doesn't |
| recursively remove entries. |
| </para> |
| |
| <para> |
| Be sure to free the <structfield>data</structfield> entry from |
| the <structname>struct proc_dir_entry</structname> before |
| <function>remove_proc_entry</function> is called, if any |
| <structfield>data</structfield> was allocated. See <xref |
| linkend="usingdata"/> for more information |
| on using the <structfield>data</structfield> entry. |
| </para> |
| </sect1> |
| </chapter> |
| |
| |
| |
| |
| <chapter id="userland"> |
| <title>Communicating with userland</title> |
| |
| <para> |
| Instead of reading (or writing) information directly from |
| kernel memory, procfs works with <emphasis>call back |
| functions</emphasis> for files: functions that are called when |
| a specific file is being read or written. Such functions have |
| to be initialised after the procfs file is created by setting |
| the <structfield>read_proc</structfield> and/or |
| <structfield>write_proc</structfield> fields in the |
| <structname>struct proc_dir_entry*</structname> that the |
| function <function>create_proc_entry</function> returned: |
| </para> |
| |
| <programlisting> |
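| struct proc_dir_entry* entry; |
| |
| /* "foo" and 0644 are placeholder values */ |
| entry = create_proc_entry("foo", 0644, NULL); |
| if (entry != NULL) { |
|         entry->read_proc = read_proc_foo; |
|         entry->write_proc = write_proc_foo; |
| } |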
| </programlisting> |
| |
| <para> |
| If you only want to use the |
| <structfield>read_proc</structfield>, the function |
| <function>create_proc_read_entry</function> described in <xref |
| linkend="convenience"/> may be used to create and initialise the |
| procfs entry in one single call. |
| </para> |
| |
| |
| |
| <sect1 id="Reading_data"> |
| <title>Reading data</title> |
| |
| <para> |
| The read function is a call back function that allows userland |
| processes to read data from the kernel. The read function |
| should have the following format: |
| </para> |
| |
| <funcsynopsis> |
| <funcprototype> |
| <funcdef>int <function>read_func</function></funcdef> |
| <paramdef>char* <parameter>buffer</parameter></paramdef> |
| <paramdef>char** <parameter>start</parameter></paramdef> |
| <paramdef>off_t <parameter>off</parameter></paramdef> |
| <paramdef>int <parameter>count</parameter></paramdef> |
| <paramdef>int* <parameter>peof</parameter></paramdef> |
| <paramdef>void* <parameter>data</parameter></paramdef> |
| </funcprototype> |
| </funcsynopsis> |
| |
| <para> |
| The read function should write its information into the |
| <parameter>buffer</parameter>, which will be exactly |
| <literal>PAGE_SIZE</literal> bytes long. |
| </para> |
| |
| <para> |
| The parameter |
| <parameter>peof</parameter> should be used to signal that the |
| end of the file has been reached by writing |
| <literal>1</literal> to the memory location |
| <parameter>peof</parameter> points to. |
| </para> |
| |
| <para> |
| The <parameter>data</parameter> |
| parameter can be used to create a single call back function for |
| several files, see <xref linkend="usingdata"/>. |
| </para> |
| |
| <para> |
| The rest of the parameters and the return value are described |
| by a comment in <filename>fs/proc/generic.c</filename> as follows: |
| </para> |
| |
| <blockquote> |
| <para> |
| You have three ways to return data: |
| </para> |
| <orderedlist> |
| <listitem> |
| <para> |
| Leave <literal>*start = NULL</literal>. (This is the default.) |
| Put the data of the requested offset at that |
| offset within the buffer. Return the number (<literal>n</literal>) |
| of bytes there are from the beginning of the |
| buffer up to the last byte of data. If the |
| number of supplied bytes (<literal>= n - offset</literal>) is |
| greater than zero and you didn't signal eof |
| and the reader is prepared to take more data |
| you will be called again with the requested |
| offset advanced by the number of bytes |
| absorbed. This interface is useful for files |
| no larger than the buffer. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| Set <literal>*start</literal> to an unsigned long value less than |
| the buffer address but greater than zero. |
| Put the data of the requested offset at the |
| beginning of the buffer. Return the number of |
| bytes of data placed there. If this number is |
| greater than zero and you didn't signal eof |
| and the reader is prepared to take more data |
| you will be called again with the requested |
| offset advanced by <literal>*start</literal>. This interface is |
| useful when you have a large file consisting |
| of a series of blocks which you want to count |
| and return as wholes. |
| (Hack by Paul.Russell@rustcorp.com.au) |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| Set <literal>*start</literal> to an address within the buffer. |
| Put the data of the requested offset at <literal>*start</literal>. |
| Return the number of bytes of data placed there. |
| If this number is greater than zero and you |
| didn't signal eof and the reader is prepared to |
| take more data you will be called again with the |
| requested offset advanced by the number of bytes |
| absorbed. |
| </para> |
| </listitem> |
| </orderedlist> |
| </blockquote> |
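| |
| <para> |
| The first and third methods can be sketched as follows. This is |
| a userland model rather than kernel code, so that it compiles |
| without kernel headers: <literal>SKETCH_PAGE_SIZE</literal>, |
| <function>hello_read</function> and |
| <function>window_read</function> are hypothetical names, and |
| <function>snprintf</function> and <function>memcpy</function> |
| stand in for whatever the kernel code would use. Only the call |
| back signature is taken from the text above. |
| </para> |
| |
| <programlisting><![CDATA[ |
```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Assumed stand-in for the kernel's PAGE_SIZE; the real value
 * is architecture dependent. */
#define SKETCH_PAGE_SIZE 4096

/* Method 1: leave *start alone (NULL), write the whole file at
 * the beginning of the buffer and return its total length. */
int hello_read(char *buffer, char **start, off_t off,
               int count, int *peof, void *data)
{
        int n;

        (void)start; (void)off; (void)count; (void)data;

        n = snprintf(buffer, SKETCH_PAGE_SIZE,
                     "hello from procfs\n");
        *peof = 1;      /* everything fits in one call */
        return n;
}

/* Method 3: copy only the requested window, point *start at it
 * and return the number of bytes placed there. In this sketch
 * data is assumed to point to a NUL-terminated string. */
int window_read(char *buffer, char **start, off_t off,
                int count, int *peof, void *data)
{
        const char *text = data;
        size_t total = strlen(text);
        size_t n;

        if (off < 0 || count <= 0)
                return 0;
        if ((size_t)off >= total) {
                *peof = 1;      /* nothing left to read */
                return 0;
        }

        n = total - (size_t)off;
        if (n > (size_t)count)
                n = (size_t)count;
        if (n > SKETCH_PAGE_SIZE)
                n = SKETCH_PAGE_SIZE;

        memcpy(buffer, text + off, n);
        *start = buffer;        /* an address within the buffer */
        if ((size_t)off + n == total)
                *peof = 1;      /* this window ends the file */
        return (int)n;
}
```
| ]]></programlisting> |
| |
| <para> |
| With the first method the caller skips the first |
| <parameter>off</parameter> bytes itself, which is why |
| <function>hello_read</function> can ignore its |
| <parameter>off</parameter> and <parameter>count</parameter> |
| parameters. The third method does that bookkeeping in the call |
| back, which pays off once the data no longer fits in one buffer. |
| </para> |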
| |
| <para> |
| <xref linkend="example"/> shows how to use a read call back |
| function. |
| </para> |
| </sect1> |
| |
| |
| |
| |
| <sect1 id="Writing_data"> |
| <title>Writing data</title> |
| |
| <para> |
| The write call back function allows a userland process to write |
| data to the kernel, which gives it some degree of control over |
| the kernel. The write function should have the following format: |
| </para> |
| |
| <funcsynopsis> |
| <funcprototype> |
| <funcdef>int <function>write_func</function></funcdef> |
| <paramdef>struct file* <parameter>file</parameter></paramdef> |
| <paramdef>const char* <parameter>buffer</parameter></paramdef> |
| <paramdef>unsigned long <parameter>count</parameter></paramdef> |
| <paramdef>void* <parameter>data</parameter></paramdef> |
| </funcprototype> |
| </funcsynopsis> |
| |
| <para> |
| The write function should read at most |
| <parameter>count</parameter> bytes from the |
| <parameter>buffer</parameter>. Note |
| that the <parameter>buffer</parameter> doesn't live in the |
| kernel's memory space, so it should first be copied to kernel |
| space with <function>copy_from_user</function>. The |
| <parameter>file</parameter> parameter is usually |
| ignored. <xref linkend="usingdata"/> shows how to use the |
| <parameter>data</parameter> parameter. |
| </para> |
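| |
| <para> |
| A minimal sketch of such a write function follows. It is a |
| userland model, so it can be compiled and tried outside the |
| kernel: <function>sketch_copy_from_user</function> is a |
| hypothetical stand-in for <function>copy_from_user</function> |
| that keeps its convention of returning the number of bytes that |
| could not be copied, and <function>level_write</function> and |
| <literal>SKETCH_BUF_LEN</literal> are illustrative names. The |
| sketch assumes that <parameter>data</parameter> points to a |
| <type>long</type>, and that the function returns the number of |
| bytes it consumed or a negative error code; the text above does |
| not specify the return value. |
| </para> |
| |
| <programlisting><![CDATA[ |
```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct file;    /* opaque: the sketch never looks inside */

#define SKETCH_BUF_LEN 16

/* Hypothetical userland stand-in for copy_from_user(): returns
 * the number of bytes that could NOT be copied (0 on success). */
static unsigned long sketch_copy_from_user(void *to,
                const char *from, unsigned long n)
{
        memcpy(to, from, n);
        return 0;
}

/* Parse a decimal number written by the user into *data. */
int level_write(struct file *file, const char *buffer,
                unsigned long count, void *data)
{
        char kbuf[SKETCH_BUF_LEN];
        unsigned long len = count;
        long *level = data;

        (void)file;     /* usually ignored */

        /* never read more than fits; keep room for the NUL */
        if (len > sizeof(kbuf) - 1)
                len = sizeof(kbuf) - 1;

        /* the user buffer must be copied before it is used */
        if (sketch_copy_from_user(kbuf, buffer, len) != 0)
                return -EFAULT;
        kbuf[len] = '\0';

        *level = strtol(kbuf, NULL, 10);
        return (int)len;
}
```
| ]]></programlisting> |
| |
| <para> |
| Clamping <parameter>count</parameter> to the size of the local |
| buffer before copying is the important step: the value comes |
| from userland and cannot be trusted. |
| </para> |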
| |
| <para> |
| Again, <xref linkend="example"/> shows how to use this call back |
| function. |
| </para> |
| </sect1> |
| |
| |
| |
| |
| <sect1 id="usingdata"> |
| <title>A single call back for many files</title> |
| |
| <para> |
| When a large number of almost identical files is used, it's |
| quite inconvenient to use a separate call back function for |
| each file. A better approach is to have a single call back |
| function that distinguishes between the files by using the |
| <structfield>data</structfield> field in <structname>struct |
| proc_dir_entry</structname>. First of all, the |
| <structfield>data</structfield> field has to be initialised: |
| </para> |
| |
| <programlisting> |
| struct proc_dir_entry* entry; |
| struct my_file_data *file_data; |
| |
| file_data = kmalloc(sizeof(struct my_file_data), GFP_KERNEL); |
| entry->data = file_data; |
| </programlisting> |
| |
| <para> |
| The <structfield>data</structfield> field is a <type>void |
| *</type>, so it can be initialised with anything. |
| </para> |
| |
| <para> |
| Now that the <structfield>data</structfield> field is set, the |
| <function>read_proc</function> and |
| <function>write_proc</function> can use it to distinguish |
| between files because they get it passed into their |
| <parameter>data</parameter> parameter: |
| </para> |
| |
| <programlisting> |
| int foo_read_func(char *page, char **start, off_t off, |
|                   int count, int *eof, void *data) |
| { |
|         int len = 0;    /* number of bytes written to page */ |
| |
|         if (data == file_data) { |
|                 /* special case for this file */ |
|         } else { |
|                 /* normal processing */ |
|         } |
| |
|         return len; |
| } |
| </programlisting> |
| |
| <para> |
| Be sure to free the <structfield>data</structfield> field |
| when removing the procfs entry. |
| </para> |
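| |
| <para> |
| Instead of comparing pointers, the call back can also read what |
| it needs straight from the structure that |
| <parameter>data</parameter> points to. The following userland |
| sketch illustrates this; <structname>struct |
| sketch_file_data</structname>, |
| <function>shared_read</function> and |
| <literal>SKETCH_PAGE_SIZE</literal> are hypothetical names, and |
| <function>snprintf</function> is used so that the code compiles |
| without kernel headers. |
| </para> |
| |
| <programlisting><![CDATA[ |
```c
#include <stdio.h>
#include <sys/types.h>

#define SKETCH_PAGE_SIZE 4096   /* assumed buffer size */

/* Hypothetical per-file data, one instance per procfs file. */
struct sketch_file_data {
        const char *name;
        int value;
};

/* One read call back shared by every file: all it knows about
 * "its" file arrives through the data parameter. */
int shared_read(char *page, char **start, off_t off,
                int count, int *eof, void *data)
{
        const struct sketch_file_data *fd = data;
        int len;

        (void)start; (void)off; (void)count;

        len = snprintf(page, SKETCH_PAGE_SIZE, "%s = %d\n",
                       fd->name, fd->value);
        if (len >= SKETCH_PAGE_SIZE)    /* output was truncated */
                len = SKETCH_PAGE_SIZE - 1;
        *eof = 1;
        return len;
}
```
| ]]></programlisting> |
| |
| <para> |
| Each file then gets its own <structname>struct |
| sketch_file_data</structname> instance in its |
| <structfield>data</structfield> field, while all of them share |
| the same read function. |
| </para> |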
| </sect1> |
| </chapter> |
| |
| |
| |
| |
| <chapter id="tips"> |
| <title>Tips and tricks</title> |
| |
| |
| |
| |
| <sect1 id="convenience"> |
| <title>Convenience functions</title> |
| |
| <funcsynopsis> |
| <funcprototype> |
| <funcdef>struct proc_dir_entry* <function>create_proc_read_entry</function></funcdef> |
| <paramdef>const char* <parameter>name</parameter></paramdef> |
| <paramdef>mode_t <parameter>mode</parameter></paramdef> |
| <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef> |
| <paramdef>read_proc_t* <parameter>read_proc</parameter></paramdef> |
| <paramdef>void* <parameter>data</parameter></paramdef> |
| </funcprototype> |
| </funcsynopsis> |
| |
| <para> |
| This function creates a regular file in exactly the same way |
| as <function>create_proc_entry</function> from <xref |
| linkend="regularfile"/> does, but also lets you set the read |
| function <parameter>read_proc</parameter> in one call. It can |
| set the <parameter>data</parameter> field as well, as |
| explained in <xref linkend="usingdata"/>. |
| </para> |
| </sect1> |
| |
| |
| |
| <sect1 id="Modules"> |
| <title>Modules</title> |
| |
| <para> |
| If procfs is being used from within a module, be sure to set |
| the <structfield>owner</structfield> field in the |
| <structname>struct proc_dir_entry</structname> to |
| <constant>THIS_MODULE</constant>. |
| </para> |
| |
| <programlisting> |
| struct proc_dir_entry* entry; |
| |
| entry->owner = THIS_MODULE; |
| </programlisting> |
| </sect1> |
| |
| |
| |
| |
| <sect1 id="Mode_and_ownership"> |
| <title>Mode and ownership</title> |
| |
| <para> |
| Sometimes it is useful to change the mode and/or ownership of |
| a procfs entry. Here is an example that shows how to achieve |
| that: |
| </para> |
| |
| <programlisting> |
| struct proc_dir_entry* entry; |
| |
| entry->mode = S_IWUSR | S_IRUSR | S_IRGRP | S_IROTH; |
| entry->uid = 0; |
| entry->gid = 100; |
| </programlisting> |
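| |
| <para> |
| The mode in the example above is the familiar |
| <constant>0644</constant> (<literal>rw-r--r--</literal>). To |
| check what a combination of mode bits amounts to, a small |
| userland helper such as the following can be used. It relies |
| only on the constants from <filename>sys/stat.h</filename>; |
| <function>sketch_mode_string</function> is a hypothetical name |
| and not part of any kernel API. |
| </para> |
| |
| <programlisting><![CDATA[ |
```c
#include <sys/stat.h>
#include <sys/types.h>

/* Render the nine permission bits of mode as "rwxrwxrwx" style
 * text. out must have room for 10 bytes. */
void sketch_mode_string(mode_t mode, char out[10])
{
        static const mode_t bits[9] = {
                S_IRUSR, S_IWUSR, S_IXUSR,
                S_IRGRP, S_IWGRP, S_IXGRP,
                S_IROTH, S_IWOTH, S_IXOTH
        };
        static const char flags[] = "rwxrwxrwx";
        int i;

        for (i = 0; i < 9; i++)
                out[i] = (mode & bits[i]) ? flags[i] : '-';
        out[9] = '\0';
}
```
| ]]></programlisting> |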
| |
| </sect1> |
| </chapter> |
| |
| |
| |
| |
| <chapter id="example"> |
| <title>Example</title> |
| |
| <!-- be careful with the example code: it shouldn't be wider than |
| approx. 60 columns, or otherwise it won't fit properly on a page |
| --> |
| |
| &procfsexample; |
| |
| </chapter> |
| </book> |