Google Code Search

  Google Code Search packagemap file definition


Contents

Overview
Syntax

Overview

Google Code Search enables users to search the web for archives containing source code. Our software locates source code files within those archives, and detects the language and license. Just as you can use a regular Sitemap to give us information about the pages on your site, you can use a packagemap file to tell us the language and license of the source code in your archive files.

Syntax

A packagemap file is written in XML. Here is an example:

<?xml version="1.0" encoding="UTF-8"?>
<fileset>
  <file>
    <path>source/myfile.cpp</path>
    <type>C++</type>
    <license>LGPL</license>
  </file>

  <file>
    <path>messages/messages.tgz</path>
    <type>archive</type>
    <license>BSD</license>
    <packagemap>info/PackageMap.xml</packagemap>
  </file>
</fileset>
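
To make the structure concrete, here is a minimal sketch of reading the example above with Python's standard xml.etree.ElementTree module. The function name read_packagemap and the dict layout are illustrative assumptions, not part of the packagemap format or of any Google tool.

```python
import xml.etree.ElementTree as ET

# The example packagemap from above, kept as bytes so the parser can
# honour the encoding named in the XML declaration.
PACKAGEMAP = b"""<?xml version="1.0" encoding="UTF-8"?>
<fileset>
<file>
   <path>source/myfile.cpp</path>
   <type>C++</type>
   <license>LGPL</license>
</file>
<file>
   <path>messages/messages.tgz</path>
   <type>archive</type>
   <license>BSD</license>
   <packagemap>info/PackageMap.xml</packagemap>
</file>
</fileset>
"""


def read_packagemap(data):
    """Hypothetical helper: return one dict per <file> entry.

    Optional tags that are absent map to None.
    """
    root = ET.fromstring(data)
    entries = []
    for file_el in root.findall("file"):
        entry = {}
        for tag in ("path", "type", "license", "packagemap"):
            text = file_el.findtext(tag)
            # Leading and trailing white space is ignored by the format.
            entry[tag] = text.strip() if text is not None else None
        entries.append(entry)
    return entries


entries = read_packagemap(PACKAGEMAP)
for entry in entries:
    print(entry)
```

The sketch only reads the four tags defined in this document and does not validate them.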

File names

In a Code Search Sitemap, specify the name of the packagemap with the <packagemap> tag. If you don't specify the packagemap file, we will check the top directory in the archive for the following files, and use the first one that is found:

  • PACKAGEMAP.XML
  • PACKAGEMAP.xml
  • Packagemap.xml
  • packagemap.xml
  • PACKAGEMAP
  • Packagemap
  • packagemap
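
The lookup order above could be sketched as follows. This is only an illustration of "first match wins": the helper name find_packagemap and its arguments are assumptions, and the sketch does not check whether an explicitly declared packagemap actually exists in the archive, because this document does not say what happens in that case.

```python
# Default names, in the order listed above. The first one present wins.
DEFAULT_NAMES = [
    "PACKAGEMAP.XML",
    "PACKAGEMAP.xml",
    "Packagemap.xml",
    "packagemap.xml",
    "PACKAGEMAP",
    "Packagemap",
    "packagemap",
]


def find_packagemap(top_level_names, declared=None):
    """Hypothetical helper: choose the packagemap path for an archive.

    top_level_names: names of the entries in the archive's top directory.
    declared: the value of the <packagemap> tag, or None if it was omitted.
    """
    if declared is not None:
        # An explicit <packagemap> value takes precedence over the defaults.
        return declared
    present = set(top_level_names)
    for name in DEFAULT_NAMES:
        # Names are compared exactly, so the match is case sensitive.
        if name in present:
            return name
    return None


print(find_packagemap(["README", "packagemap", "Packagemap.xml"]))
```

Here "Packagemap.xml" is chosen over "packagemap" because it appears earlier in the list.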

XML tag definitions

The available XML tags are described below.

<fileset>
required

Encapsulates the file and references the current protocol standard.

<file>
required

Child of <fileset>.

<path>
required

Child of <file>. Describes the file path within the archive. Case sensitive; can contain any characters.

<type>
required

Child of <file>. Value can be a language name or "archive". Examples for the language name include: "C", "Python", "C#", "Java", "Vim".

Case is ignored; "Java", "JAVA" and "java" are equivalent.

The value must consist of printable ASCII characters and must not contain white space.

We index only files whose <type> is a supported language; all other files are ignored. You can still use a language name that we do not support yet, and we may index the file in the future.

The special value "archive" can be used for an archive inside an archive. This is only useful if this archive contains source code.

Because Code Search indexes only source code, there is no need to add an entry for any archive containing only text, html, etc.
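
The rules for <type> can be summarized in a short sketch. The set SUPPORTED_LANGUAGES below is a placeholder built from the example names in this document; the actual list of supported languages is not given here. The function name classify_type and its return values are likewise assumptions, as is treating "archive" case-insensitively like the language names.

```python
# Placeholder only: the real list of supported languages is not part of
# this document. These are the example names it mentions, lowercased.
SUPPORTED_LANGUAGES = {"c", "c++", "c#", "python", "java", "vim"}


def classify_type(value):
    """Hypothetical helper: classify a <type> value.

    Returns "archive", "indexed", or "ignored".
    Raises ValueError if the value breaks the character rules.
    """
    value = value.strip()  # leading and trailing white space is ignored
    # Printable ASCII without white space is the range 0x21 to 0x7E.
    if not value or any(not (0x21 <= ord(ch) <= 0x7E) for ch in value):
        raise ValueError("type must be printable ASCII with no white space")
    key = value.lower()  # case is ignored
    if key == "archive":
        return "archive"
    if key in SUPPORTED_LANGUAGES:
        return "indexed"
    # Unsupported names are allowed, but the file is not indexed for now.
    return "ignored"


for name in ("Java", "JAVA", "java", "archive", "Befunge"):
    print(name, classify_type(name))
```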

<license>
optional

Child of <file>. Value should be the name of the software license. Examples include: "GPL", "BSD", "Python", "disclaimer".

Case is ignored; "LGPL", "Lgpl" and "lgpl" are equivalent.

When <type> is "archive" the value of <license> is the default license for the files in the archive. A different license can be specified for specific files with a packagemap in the archive.

The license must be one of the supported licenses. For unrecognized licenses, the license value will be listed as "unknown".
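
As a rough illustration of how the archive default and the per-file value might combine, consider the sketch below. SUPPORTED_LICENSES is a placeholder built from the example names in this document, not the real list. Returning None when no license is declared anywhere is also an assumption, meant to represent "left to automatic detection".

```python
# Placeholder only: the real list of supported licenses is not part of
# this document. These are the example names it mentions, lowercased.
SUPPORTED_LICENSES = {"gpl", "lgpl", "bsd", "python", "disclaimer"}


def effective_license(file_license, archive_default=None):
    """Hypothetical helper: resolve the license recorded for one file.

    file_license: <license> from the nested packagemap entry, or None.
    archive_default: <license> of the enclosing archive entry, or None.
    """
    # A license given for the specific file overrides the archive default.
    chosen = file_license if file_license is not None else archive_default
    if chosen is None:
        return None  # nothing declared (assumption: detection takes over)
    key = chosen.strip().lower()  # case is ignored
    return key if key in SUPPORTED_LICENSES else "unknown"


print(effective_license(None, archive_default="BSD"))
print(effective_license("Lgpl", archive_default="BSD"))
print(effective_license("MyOwnLicense"))
```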

<packagemap>
optional

Child of <file>. The name of the packagemap file inside the archive. We recommend "PACKAGEMAP.xml". In this case, we will automatically detect the packagemap file, so you do not need to include this tag.

Case sensitive.

This tag can be used only for <file> entries where the value of <type> is "archive".
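
The required tags and the restriction on <packagemap> lend themselves to a simple check. The following is a sketch of such a check under the rules stated above; the function name check_file_entry and the wording of the messages are assumptions, and how Code Search itself reports invalid entries is not described in this document.

```python
import xml.etree.ElementTree as ET


def check_file_entry(file_el):
    """Hypothetical helper: list the problems found in one <file> element."""
    problems = []
    # <path> and <type> are required children of <file>.
    for tag in ("path", "type"):
        text = file_el.findtext(tag)
        if text is None or not text.strip():
            problems.append("missing required <%s>" % tag)
    type_value = (file_el.findtext("type") or "").strip().lower()
    # <packagemap> is only meaningful for nested archives.
    if file_el.find("packagemap") is not None and type_value != "archive":
        problems.append('<packagemap> requires <type> to be "archive"')
    return problems


good = ET.fromstring(
    "<file><path>messages/messages.tgz</path><type>archive</type>"
    "<packagemap>info/PackageMap.xml</packagemap></file>"
)
bad = ET.fromstring(
    "<file><path>source/myfile.cpp</path><type>C++</type>"
    "<packagemap>info/PackageMap.xml</packagemap></file>"
)
print(check_file_entry(good))
print(check_file_entry(bad))
```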

Entity escaping

Leading and trailing white space is ignored. UTF-8 encoding is mandatory. As with all XML files, any data values (including URLs) must use entity escape codes for the characters listed in the table below.

Character       Symbol   Escape Code
Ampersand       &        &amp;
Single Quote    '        &apos;
Double Quote    "        &quot;
Greater Than    >        &gt;
Less Than       <        &lt;
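
If you generate packagemap files with a script, the five escapes in the table can be applied with Python's standard xml.sax.saxutils.escape function, which handles &, < and > by default and accepts a dictionary of extra replacements. The wrapper name escape_value and the sample path are made up for illustration.

```python
from xml.sax.saxutils import escape

# escape() already handles &, < and >; the two quote characters are
# passed as extra replacements.
QUOTE_ENTITIES = {"'": "&apos;", '"': "&quot;"}


def escape_value(text):
    """Hypothetical helper: escape a data value for use in a packagemap."""
    return escape(text, QUOTE_ENTITIES)


# A made-up path that contains several of the special characters.
escaped = escape_value("src/Tom & Jerry's <draft>.c")
print(escaped)  # src/Tom &amp; Jerry&apos;s &lt;draft&gt;.c
```

The ampersand is replaced first, so the entities produced for the other characters are not escaped a second time.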


©2011 Google