DRC Runsets

This document gives a detailed introduction to writing DRC runsets. See also DRC Reference for a full reference of the DRC functions.

Runset basics

Runsets are basically Ruby scripts running in the context of a DRC runset interpreter. On that level, DRC runsets work with very few classes, specifically:

  • Layers ("DRC::DRCLayer" class): layers represent input from the original layout or are created by functions generating information. Layers can be used as input for other methods, or methods can be called on layers.
  • Sources ("DRC::DRCSource" class): sources represent layout objects from which input is taken. One default source is always provided - the default layout from which the data is taken. More layout sources can be created to specify input from other layouts. Sources also carry information on how to filter the input - for example cell filters or the working region (a rectangular region from which the input is taken).

Some functions are provided on global level and can be used without any object.

The basic elements of runsets are input and output specifications. Input is specified through "input" method calls. "input" will create a layer object that contains the shapes of the specified layer. The results are output by calling the "output" method on a layer object with a specification of where the output shall be sent.

In general, the runset language is rich in alternatives - often there is more than one way to achieve the same result.

The script is executed in immediate mode. That is, each function will immediately be executed and the results of the operations can be used in conditional expressions and loops. Specifically it is possible to query whether a layer is empty and abort a loop or skip some block in that case.

Being Ruby scripts running in KLayout's scripting engine environment, runsets can make use of KLayout's full database access layer. It is possible to manipulate geometrical data on a per-shape basis. For that purpose, methods are provided to interface between the database access layer ("RBA::..." objects) and the DRC objects ("DRC::..." objects). Typically however it is faster and easier to work with the DRC objects and methods.

Input and output

Input is specified with the "input" method or global function. "input" is basically a method of a source object. There is always one default source object, which is the first layout loaded into the current view. Using "input" without a source object calls that method on the default source object. A source is basically a collection of multiple layers, and "input" will select one of them.

"input" will create a layer object representing the shapes of the specified input layer. There are multiple ways to specify the layer from which the input is taken. One of them is by GDS layer and datatype specification:

# GDS layer 17, datatype 0
l = input(17)

# GDS layer 17, datatype 10
l = input(17, 10)

# By expression (here: GDS layers 1-10, datatype 0 plus layer 21, datatype 10)
# All shapes are combined into one layer
l = input("1-10/0", "21/10")

Input can be obtained from other layouts than the default one. To do so, create a source object using the "layout" global function:

# layer 17 from second layout loaded
l = layout("@2").input(17)

# layer 100, datatype 1 and 2 from "other_layout.gds"
other_layout = layout("other_layout.gds")
l1 = other_layout.input(100, 1)
l2 = other_layout.input(100, 2)

Output is by default sent to the default layout - the first one loaded into the current view. The output specification includes the layer and datatype or the layer name:

# send output to the default layout: layer 17, datatype 0
l.output(17, 0)

# send output to the default layout: layer named "OUT"
l.output("OUT")

# send output to the default layout: layer 17, datatype 0, named "OUT"
l.output(17, 0, "OUT")

Output can be sent to other layouts using the "target" function:

# send output to the second layout loaded:
target("@2")

# send output to "out.gds", cell "OUT_TOP"
target("out.gds", "OUT_TOP")

Output can also be sent to a report database:

# send output to a report database with description "Output database"
# - after the runset has finished this database will be shown
report("Output database")

# send output to a report database saved to "drc.lyrdb"
report("Output database", "drc.lyrdb")

When output is sent to a report database, the specification must include a formal name and optionally a description. The output method will create a new category inside the report database and use the name and description for that:

# specify report database for output
report("The output database")
...
# Send data from layer l to new category "check1"
l.output("check1", "The first check")

The report and target specification must appear before the actual output statements. Multiple report and target specifications can be present sending output to various layouts or report databases. Note that each report or target specification will close the previous one. Using the same file name for subsequent reports will not append data to the file but rather overwrite the previous file.

Layers that have been created using "output" can be used for input again, but care should be taken to place the input statement after the output statement. Otherwise the results will be unpredictable.

Dimension specifications

Dimension specifications are used in many places: for coordinates, for spacing and width values and as length values. In all places, the following rules apply:

  • Floating-point numbers are interpreted as micron values by default.
  • Integer numbers are interpreted as database units by default (not integer micron values!).
  • To make explicitly clear what dimensions to use, you can add a unit.

Units are added using the unit methods:

  • 0.1: 0.1 micrometer
  • 200: 200 database units
  • 200.dbu: 200 database units
  • 200.nm: 200 nm
  • 2.um or 2.micron: 2 micrometer
  • 0.2.mm: 0.2 millimeter
  • 1e-5.m: 1e-5 meter (10 micrometer)
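
The default rule for plain numbers can be sketched in standalone Ruby. This is not KLayout's implementation: "to_dbu" is a hypothetical helper, and the database unit of 0.001 micrometer is an assumption made for the illustration only.

```ruby
# Standalone sketch (plain Ruby, runs outside KLayout).
# Assumption: the layout's database unit is 0.001 micrometer (1 nm).
DBU_IN_MICRONS = 0.001

# Hypothetical helper: converts a plain number to database units
# following the default rule described above.
def to_dbu(value)
  case value
  when Integer
    value                              # integers are taken as database units
  when Float
    (value / DBU_IN_MICRONS).round     # floats are taken as micrometers
  else
    raise ArgumentError, "unsupported dimension: #{value.inspect}"
  end
end

puts to_dbu(200)    # 200 database units
puts to_dbu(0.2)    # 0.2 micrometer
puts to_dbu(200.0)  # 200 micrometer, not 200 database units
```

The last two calls show why the distinction matters: "200" and "200.0" differ by a factor of 1000 under the assumed database unit, so adding an explicit unit is the safer choice when in doubt.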

Area units are usually square micrometers. You can use units as well to indicate an area value in some specific measurement units:

  • 0.1.um2 or 0.1.micron2: 0.1 square micron
  • 0.1.mm2: 0.1 square millimeter

Angles are always given in degree units. You can make clear that you want to use degree by adding the degree unit method:

  • 45.degree: 45 degree

Objects and methods

Runsets are basically scripts written in an object-oriented language. It is possible to write runsets that don't make much use of that fact, but having a notion of the underlying concepts will result in better understanding of the features and how to make full use of the capabilities.

In KLayout's DRC language, a layer is an object that provides a number of methods. The boolean operations are methods, the DRC functions are methods and so on. Methods are called "on" an object using the notation "object.method(arguments)". Many methods produce new layer objects and other methods can be called on those. The following code creates a sized version of the input layer and outputs it. Two method calls are involved: one sized call on the input layer returning a new layer object and one output call on that object.

input(1, 0).sized(0.1).output(100, 0)

The size method, like other methods, is available in two flavors: an in-place method and an out-of-place method. "sized" is out-of-place, meaning that the method will return a new object with the new content but will not modify the object it is called on. The in-place version is "size", which modifies the object. Only the layer object is modified, not the original layer in the layout.

The following is the above code written with the in-place version:

layer = input(1, 0)
layer.size(0.1)
layer.output(100, 0)

Using the in-place versions is slightly more efficient in terms of memory, since with the out-of-place version KLayout will keep the unmodified copy as long as there is a chance it may be required. On the other hand, the in-place version may cause strange side effects because of the definition of the copy operation: a simple copy will just copy a reference to a layer object, not the object itself:

layer = input(1, 0)
layer2 = layer
layer.size(0.1)
layer.output(100, 0)
layer2.output(101, 0)

This code will produce the same sized output for layer 100 and 101, because the copy operation "layer2 = layer" will not copy the content but just a reference: after sizing "layer", "layer2" will also point to that sized layer.

That problem can be solved by either using the out-of-place version or by creating a deep copy with the "dup" function:

# out-of-place size:
layer = input(1, 0)
layer2 = layer
layer = layer.sized(0.1)
layer.output(100, 0)
layer2.output(101, 0)

# deep copy before size:
layer = input(1, 0)
layer2 = layer.dup
layer.size(0.1)
layer.output(100, 0)
layer2.output(101, 0)
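
The reference semantics behind this pitfall are plain Ruby behavior and can be reproduced outside KLayout. The following sketch uses "FakeLayer", a hypothetical stand-in class that is not part of the DRC API; it only mimics an in-place "size", an out-of-place "sized" and a "dup" that copies the content.

```ruby
# Standalone sketch (plain Ruby). FakeLayer is a hypothetical stand-in,
# not a KLayout class: it holds a list of widths instead of polygons.
class FakeLayer
  attr_reader :widths

  def initialize(widths)
    @widths = widths
  end

  # Called by dup: copy the content, not just the reference to it
  def initialize_copy(other)
    @widths = other.widths.dup
  end

  # In-place: modifies this object and returns it
  def size(d)
    @widths = @widths.map { |w| w + 2 * d }
    self
  end

  # Out-of-place: leaves this object alone and returns a new one
  def sized(d)
    dup.size(d)
  end
end

layer = FakeLayer.new([1.0])
alias_of_layer = layer        # copies the reference only
copy_of_layer = layer.dup     # copies the content
layer.size(0.5)

puts alias_of_layer.widths.inspect  # follows the in-place change
puts copy_of_layer.widths.inspect   # keeps the original content
```

Here "alias_of_layer" changes together with "layer" because both names refer to the same object, while "copy_of_layer" is unaffected.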

Some methods are provided in different flavors including function-style calls. For example the width check can be written in two ways:

# method style:
layer.width(0.2).output("width violations")

# function style:
w = width(layer, 0.2)
output(w, "width violations")

The function style is intended for users not familiar with the object-oriented style who prefer a function notation.

Here is a brief overview of some of the methods available:

Edge and polygon layers

KLayout knows two basic layer types: polygon and edge layers. Input from layout is always of polygon type initially.

Polygon layers describe objects having an area ("filled objects" in the drawing view). Such objects can be processed with boolean operations, sized, decomposed into holes and hull, filtered by area and perimeter and so on. DRC methods such as width and spacing checks can be applied within individual polygons in a different way than between different polygons (see space, separation and notch for example).

Polygons can be raw or merged. Merged polygons consist of a hull contour and zero to many hole contours inside the hull. Merging can be ensured by putting a layer into "clean" mode (see clean, clean mode is the default). Raw polygons usually don't have such a representation and consist of a single contour folding inside to form the holes. Raw polygons are formed in "raw" mode (see raw).

Edge layers can be derived from polygon layers and allow the description of individual edges ("sides") of a polygon. Edge layers offer DRC functions similar to those for polygons, but in a slightly different fashion - edges are checked individually, not considering the polygons they belong to. Neither do other parts of the polygons shield interactions, hence the results may be different.

Edges can be filtered by length and angle. extended allows erecting polygons (typically rectangles) on the edges. Edge layers are useful to perform operations on specific parts of polygons, for example width or space checks confined to certain edge lengths.

Edges do not differentiate whether they originate from holes or hulls of the polygon. The direction of edges is always following a certain convention: when looking from the start to the end point of an edge, the "inside" of the polygons from which the edges were derived, is to the right. In other words: the edges run along the hull in clockwise direction and counterclockwise along the holes.
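
The direction convention can be checked with a little geometry in standalone Ruby. The sketch below does not use KLayout; the helper names are hypothetical and the coordinate system is assumed to have the y axis pointing up. It takes a rectangle hull in clockwise order and probes a point slightly to the right of each edge.

```ruby
# Standalone sketch (plain Ruby, y axis pointing up). All names are hypothetical.

# Turns a closed contour into edges [[x1, y1], [x2, y2]]
def contour_edges(points)
  points.each_index.map { |i| [points[i], points[(i + 1) % points.size]] }
end

# Twice the signed area (shoelace formula): negative for clockwise contours
def doubled_signed_area(points)
  contour_edges(points).sum { |(x1, y1), (x2, y2)| x1 * y2 - x2 * y1 }
end

# A point slightly to the right of the edge's midpoint when looking
# from the start point to the end point (right-hand normal is (dy, -dx))
def right_probe(edge, offset = 0.01)
  (x1, y1), (x2, y2) = edge
  dx = x2 - x1
  dy = y2 - y1
  len = Math.hypot(dx, dy)
  [(x1 + x2) / 2.0 + offset * dy / len, (y1 + y2) / 2.0 - offset * dx / len]
end

# Hull of a 2 x 1 rectangle in clockwise order
hull = [[0, 0], [0, 1], [2, 1], [2, 0]]

puts doubled_signed_area(hull)   # negative, so the contour is clockwise
contour_edges(hull).each do |edge|
  px, py = right_probe(edge)
  inside = px > 0 && px < 2 && py > 0 && py < 1
  puts "#{edge.inspect}: right side inside the rectangle? #{inside}"
end
```

For a hole contour the same test applies with the orientation reversed: running counterclockwise along the hole keeps the filled area to the right.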

Merged edges are joined, i.e. collinear edges are merged into single edges and degenerate edges (single-point edges) are removed. Merged edges are present in "clean" mode (see clean, clean mode is the default).

Polygons can be decomposed into edges with the edges method. Another way to generate edges is to take edges from edge pair objects which are generated by the DRC check functions.

Edge pairs and edge pair collections

Edge pairs are objects consisting of two edges. Edge pairs are handy when describing a DRC check violation, because a violation occurs between two edges. The edge pair generated for such a violation consists of the parts of both edges violating the condition. For two-layer checks, the edges originate from the original layers - edge 1 is related to input 1 and edge 2 is related to input 2.

Edge pair collections act like normal layers, but very few methods are defined for those. Edge pairs can be decomposed into single edges (see edges, first_edges and second_edges).

Edge pairs can be converted to polygons using polygons. Edge pairs can have a vanishing area, for example if both edges are coincident. In order to handle such edge pairs properly, an enlargement can be applied optionally. With such an enlargement, the polygon will cover a region bigger than the original edge pair by the given enlargement.

Raw and clean layer mode

KLayout's DRC engine supports two basic ways to interpret geometrical information on a layer: in clean mode, polygons or edges are joined if they touch. If regions are drawn in separate pieces they are effectively joined before they are used. In raw mode, every polygon or shape on the input layer is considered a separate part. There are applications for both ways of looking at a set of input shapes, and KLayout supports both ways.

Clean mode is the default - every layer generated or taken from input will be used in clean mode. To switch to raw mode, use the "raw" method. "raw mode" is basically a flag set on the layer object which instructs the engine not to merge polygons prior to use. The raw mode flag can be reset with the "clean" method.

Most functions implicitly merge polygons and edges in clean mode. In the documentation this fact is referred to as "merged semantics": if merged semantics applies for the function, coherent polygons or edges are considered one object in clean mode. In raw mode, every polygon or edge is treated as an individual object.
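
The difference between the two modes can be illustrated with a one-dimensional analogy in standalone Ruby. This is only a sketch and not how the DRC engine works: intervals on a line stand in for polygons, and "merge_intervals" is a hypothetical helper that joins touching or overlapping intervals the way clean mode joins touching shapes.

```ruby
# Standalone sketch (plain Ruby): 1D intervals [from, to] stand in for polygons.

# Hypothetical helper: joins intervals that touch or overlap
def merge_intervals(intervals)
  intervals.sort.each_with_object([]) do |(from, to), merged|
    if merged.any? && from <= merged.last[1]
      # touches or overlaps the previous piece: extend it
      merged.last[1] = [merged.last[1], to].max
    else
      merged << [from, to]
    end
  end
end

drawn = [[0, 2], [2, 5], [7, 9]]   # a region drawn in two touching pieces plus one more

raw_view = drawn                   # raw mode: every drawn piece is a separate object
clean_view = merge_intervals(drawn) # clean mode: touching pieces count as one object

puts raw_view.size     # 3 objects
puts clean_view.size   # 2 objects
puts clean_view.inspect
```

A length filter applied to the raw view would see pieces of length 2, 3 and 2, while the same filter on the clean view would see lengths 5 and 2, which is the kind of difference "merged semantics" refers to.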

One application is the detection of overlapping areas after a size step:

overlaps = layer.size(0.2).raw.merged(2)

That statement has the following effect:

The "merged" method with an argument of 2 will produce output where at least two polygons overlap. The size function by default creates a clean layer, but separate polygons for each input polygon, so by using "raw", the layer is switched into raw mode, which makes the individual polygons accessible without merging them into one bigger polygon.

Please note that the raw or clean methods modify the state of a layer so beware of the following pitfall:

  layer = input(1, 0)
  layer.raw.sized(0.1).output(100, 0)

  # this check will now be done on a raw layer, since the 
  # previous raw call was putting the layer into raw mode
  layer.width(0.2).output(101, 0)

The following two images show the effect of raw and clean mode:

Logging and verbosity

While the runset is executed, a log is written that lists the methods and their execution times. The log is enabled using the verbose function. The log and info functions allow entering additional information into the log. "info" will enter the message only if verbose mode is enabled. "log" will always enter the message. silent is equivalent to "verbose(false)".

The log is shown in the log window or - if the log window is not open - on the terminal on Linux-like systems.

The log function is useful to print result counts during processing of the runset:

  ...
  drc_w = input(1, 0).width(0.2)
  log("Number of width violations: #{drc_w.size}")
  ...
  

The error function can be used to output error messages unconditionally, formatted as an error. The log can be sent to a file instead of the log window or terminal output, using the log_file function:

  log_file("drc_log.txt")
  verbose(true)
  info("This message will be sent to the log file")
  ...
  

The tiling option

Tiling is a method to reduce the memory requirements for an operation. For big layouts, pulling a whole layer into the engine is not a good idea - huge layouts will require a lot of memory. The tiling method cuts the layout into tiles with a given width and height and processes them individually. The tiling implementation of KLayout can make use of multiple CPU cores by distributing the jobs on different cores.

Tiling does not come for free: some operations have a potentially infinite range. For example, selecting edges by their length in clean mode requires collecting all pieces of the edge before the full length can be computed. An edge running over a long distance however may cross multiple tiles, so that the pieces within one tile don't sum up to the correct length.
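
The edge length problem can be reproduced with a small standalone Ruby sketch. It does not use KLayout: "clip_to_tiles" is a hypothetical helper that cuts a horizontal edge, given by its start and end coordinate in database units, at tile boundaries of an assumed tile size.

```ruby
# Standalone sketch (plain Ruby). Coordinates are non-negative integers
# in database units; the tile grid is assumed to start at 0.

# Hypothetical helper: cuts the edge [edge_start, edge_end] at tile boundaries
def clip_to_tiles(edge_start, edge_end, tile_size)
  pieces = []
  pos = edge_start
  while pos < edge_end
    tile_end = (pos / tile_size + 1) * tile_size   # next tile boundary
    stop = [tile_end, edge_end].min
    pieces << [pos, stop]
    pos = stop
  end
  pieces
end

edge = [0, 2500]
min_length = 2000

pieces = clip_to_tiles(edge[0], edge[1], 1000)
piece_lengths = pieces.map { |from, to| to - from }

puts piece_lengths.inspect                              # lengths seen per tile
puts (edge[1] - edge[0]) >= min_length                  # flat mode: edge is selected
puts piece_lengths.any? { |len| len >= min_length }     # tiled: no piece qualifies
```

Seen as a whole, the edge passes the length filter, but none of the per-tile pieces does, so a tiled run would lose it.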

Fortunately, many operations don't have an infinite range, so that tiling can be applied successfully. These are the boolean operations, sizing and DRC functions. For those operations, a border is added to the tile which extends the region inside which the shapes are collected. That way, all shapes potentially participating in an operation are collected. After performing the operation, polygons and edges extending beyond the tile's original boundary are clipped. Edge pairs are retained if they touch or overlap the original tile's border. That preserves the outline of the edge pairs, but may render redundant markers in the tile's border region.
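
The tile and border geometry described above can be sketched in standalone Ruby. This is an illustration under simplifying assumptions, not KLayout's tiling implementation: "tiles_with_border" is a hypothetical helper, boxes are given as [x1, y1, x2, y2] in integer database units, and the tile grid starts at the lower-left corner of the bounding box.

```ruby
# Standalone sketch (plain Ruby). Boxes are [x1, y1, x2, y2] in database units.

# Hypothetical helper: splits bbox into tiles and computes for each tile
# the extended region ("halo") from which shapes would be collected.
def tiles_with_border(bbox, tile_size, border)
  x1, y1, x2, y2 = bbox
  tiles = []
  y = y1
  while y < y2
    x = x1
    while x < x2
      # the tile itself: results are clipped to this box
      core = [x, y, [x + tile_size, x2].min, [y + tile_size, y2].min]
      # the tile plus border: shapes are collected from this box
      halo = [core[0] - border, core[1] - border, core[2] + border, core[3] + border]
      tiles << { core: core, halo: halo }
      x += tile_size
    end
    y += tile_size
  end
  tiles
end

tiles = tiles_with_border([0, 0, 2500, 1000], 1000, 10)
puts tiles.size
tiles.each { |t| puts "core #{t[:core].inspect} collects from #{t[:halo].inspect}" }
```

The overlap between neighboring halos is the redundant region processed twice, which is why the border should stay small compared to the tile size.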

For non-local operations such as the edge length example, a finite range can be deduced in some cases. For example, if small edges are supposed to be selected, the range of the operation is limited: longer edges don't contribute to the output, so it does not matter whether to take into account potential extensions of the edge in neighboring tiles. Hence, the range is limited and a tile border can be given.

To enable tiling use the tiles function. The threads function specifies the number of CPU cores to use in tiling mode. flat will disable tiling mode:

# Use a tile size of 1mm
tiles(1.mm)
# Use 4 CPU cores
threads(4)

... tiled operations ...

# Disable tiling
flat

... non-tiled operations ...

Some operations implicitly specify a tile border. If the tile border is known (see length example above), explicit borders can be set with the tile_borders function. no_borders will reset the borders (the implicit borders will still be in place):

# Use a tile border of 10 micron:
tile_borders(10.um)

... tile operations with a 10 micron border ...

# Disable the border
no_borders

A word about the tile size: typically tile dimensions in the order of millimeters are sufficient. Leading-edge technologies may require smaller tiles. The tile border should not be bigger than a few percent of the tile's dimension to reduce the redundant tile overlap region. In general, using tiles is a compromise between safe function and performance. Very small tiles imply some performance overhead due to shape collection and potentially clipping. In addition, the clipping at the tile's borders may introduce artificial polygon nodes and related snapping to the database unit grid. That may not be desired in some applications requiring a high structure fidelity. Hence, small tiles should be avoided in that sense too.