Working with Filters

Filters allow you to quickly and interactively focus on a particular aspect or subset of your process. This chapter does not cover the individual filter types that are available in Disco (refer to Filtering for a detailed description of each filter). Instead, it shows you where to find the filters and how to work with them in practice.

You can get to the filters from multiple places in Disco, making it easy for you to modify or inspect your current filters regardless of where you are in the application.

You find the filter symbol in the following places in Disco:

Project view
Pressing the filter symbol on the lower left in the project view - see Figure 1 - brings up the filter settings for the currently selected data set.
Map view, Statistics view, Case view
You can access the filter settings from any of the three analysis views by clicking on the filter symbol in the lower left corner - see Figure 2.
../_images/OpenFilter-a.png

Figure 1: Pressing the filter symbol in the project view brings up the filter settings for the currently selected data set.

../_images/OpenFilter-b.png

Figure 2: The filter settings can be accessed from any of the three analysis views: Map view, Statistics view, and Case view.

If you click on the filter symbol in any of these places, this will bring up the filter settings for the current data set. In Figure 3 you can see that the empty filter settings contain four main areas, numbered 1-4.

../_images/FilterDialog.png

Figure 3: The filter settings dialog in Disco.

These four areas are:

List of active filters (1)
In Figure 3 the list of filters is still empty. Learn how to add filters in Adding Filters and Managing the Filter Stack. Also take a look at the Filter Short-cuts from the Map view and from the Statistics view.
Controls (2)
After building up or changing your filter list, you can apply the new filter settings. Learn what it means to apply the filter settings in Applying Filters.
Recommended filters (3)
You can learn more about Disco’s filter recommendations in Filter Recommendations.
Recipes (4)
Learn how to store, re-use and export filter combinations in Recipes: Saving, Sharing, and Re-using Filter Combinations.

Adding Filters and Managing the Filter Stack

When you want to add a filter to your data set, you can click on the click to add filter… area in the filter list as shown in Figure 4. A list of six different filter types appears in the next step. Move your mouse pointer along this list to choose the type of filter that you would like to add.

../_images/AddFilter.png

Figure 4: Add new filters by clicking on the click to add filter… area in your filter list. Then, choose the type of filter that you would like to add.

After you have chosen a filter type, an instance of this filter will be placed in your filter list. You can either stop there and work with the single filter, or you can add more filters. Figure 5 shows an example where four filters have been added to the data set: a Timeframe Filter, an Endpoints Filter, and two Attribute Filters. As you can see, multiple filters of the same type can be added to the filter stack. To learn more about how multiple filters work together, refer to the section on Combining Filters.

../_images/FilterStack.png

Figure 5: Managing the filter stack in Disco.

Figure 5 also shows you an inventory of the controls (numbered 1-6) that you can use to manage the filter stack in Disco. The following controls are available:

Settings area of the currently selected filter (1)

The configuration of the currently selected filter is shown in this area. For example, in Figure 5 you can see the current configuration of the first Attribute Filter. The settings for the different filter types are explained in detail in the sections on the individual filters (refer to Filtering).

Note that filter settings that you see in this area are always shown in reference to the underlying data set (not in reference to the outcome of the previous filter in the filter stack). One of the reasons is that in this view you are configuring the filter settings, but the filters are only actually applied to your data set once you press the Apply filter or Copy and filter button (see Applying Filters).

Filtering can take time and Disco is designed to also work well for very large data sets. So, if you have a data set with millions or billions of records, you can still freely configure your filters without any delay, and only once you hit the Apply filter or Copy and filter button Disco goes through the full, raw data underneath and filters it right in that moment. Refer to Configuring Filters Based on the Outcome of Previous Filters if you need to configure a filter based on the outcome of other filters.

Move currently selected filter up in the list (2)
If you want to move the currently selected filter up in your list, you can press the little upwards-pointing arrow button. Learn more about how changing the order of your filters in the list affects the outcome in Applying Filters and Combining Filters.
Move currently selected filter down in the list (3)
Press the downwards-pointing arrow button to move the currently selected filter down in your list. Learn more about how changing the order of your filters in the list affects the outcome in Applying Filters and Combining Filters.
Remove currently selected filter from the list (4)
If you want to remove the currently selected filter from your list of active filters, you can press the small x symbol.
Clear all filters from the list (5)
Use this large X Reset symbol at the bottom of your filter list to clear your complete filter stack (removes all filters at once).
Restore the filter list (6)
If you changed the settings of a filter but then changed your mind about it, or if you accidentally removed one of the filters from your list, you always have the option to restore the filter list as it was when you entered the filter dialog by pressing the Undo Changes symbol. [1] This way, you can safely edit and change things, while still having the option to go back if something goes wrong.

Read on to learn how to automatically add filters via shortcuts, how you can apply the current filter settings to your data set, what happens when you apply them, how you can apply filters to a copy of your data set to preserve the original data, and how you can re-use and save your filter settings through the recipe functionality in Disco.

Filter Short-cuts

In addition to directly adding filters, you can also use one of the many filter short-cuts in Disco. Filter short-cuts let you add a filter directly from your current analysis context, so you do not have to remember which filter you need or find your way to it, which saves you time.

There are four filter short-cuts available right from the process map:

  • Filter this path (see Filtering Paths from the Process Map): When you click on a path in your process map, you can use the shortcut “Filter this path” to add a pre-configured Follower Filter in directly followed mode, which keeps only cases that have passed through that particular path in the process.
  • Filter this activity (see Filtering Activities from the Process Map): When you click on an activity in your process map, you can use the shortcut “Filter this activity” to add a pre-configured Attribute Filter in Mandatory mode, which will keep only cases that at some point in time have performed this particular process step.
  • Filter this start activity (see Filtering Start and End Activities from the Process Map): When you click on one of the green, dashed lines that indicate a start point in your process map, you can use the shortcut “Filter this start activity” to add a pre-configured Endpoints Filter, which will only keep cases that have had this particular activity as the very first step in the process.
  • Filter this end activity (see Filtering Start and End Activities from the Process Map): When you click on one of the reddish, dashed lines that indicate an end point in your process map, you can use the shortcut “Filter this end activity” to add a pre-configured Endpoints Filter, which will only keep cases that have had this particular activity as the very last step in the process.

These short-cuts are very powerful for the particular use case they address: focusing on cases that pass through a particular part of your process. In addition, once you become more familiar with the filters in Disco, you will see that the short-cuts are often also the fastest way to get to filter configurations that are slightly different from, or even the opposite of, the ones that the short-cuts provide (see the video in Figure 6 for an example).

../_images/FirstTimeRight-ShortCutScenario.png

Figure 6: Illustration of how you can leverage the filter short-cuts from the process map in Disco to quickly work yourself towards your goal in an iterative and visual manner (watch the video at https://youtu.be/pmXZQhFSv10 by clicking on the image above).

Next to the short-cuts from the process map, there are also filter short-cuts available from the statistics view (see Filtering Cases, Variants, Activities, Resources, and Attributes for screenshots about how these filter short-cuts work):

  • Filter this case: When you right-click on a case in the Case statistics table you get the option to, for example, “Filter for case ‘Case1254’”, which adds a pre-configured Attribute Filter with the Cases attribute and the case Case1254 selected.
  • Filter this variant: When you right-click on a variant in the Variant statistics table you get the option to, for example, “Filter for ‘Variant 3’”, which adds a pre-configured Attribute Filter with the Variants attribute and the variant Variant 3 selected.
  • Filter this attribute (or activity or resource): When you right-click on an attribute value (or an activity or resource name) in the attribute statistics, you can add a pre-configured Attribute Filter with the corresponding attribute and the chosen attribute value selected in Mandatory mode.

All of these short-cuts can save you a significant amount of time. Once they have become second nature (it’s worth practicing!), you will be able to add filters quickly and confidently in an interactive workshop setting.

Applying Filters

After you have added and configured your filters, you can do three things in the control area of the filter dialog, see Figure 7:

../_images/FilterStack_Apply.png

Figure 7: Applying your filter settings in Disco.

Apply filter (1)

If you press the Apply filter button, then the filter settings will be applied to your current data set. You will be brought back to the analysis view (or the project view, depending on where you came from), and you will see your process map, statistics, variants, etc. based on the cases and events that remain after applying the filters.

As a reminder that you are not looking at the full data set, an indicator in the lower left corner shows you what percentage of the cases and events of the underlying data set the filtered data set contains (see Figure 8).

../_images/FilterStack_Outcome.png

Figure 8: The % Cases and % Events indicators in the lower left corner of Disco tell you how much of your original data set remains after applying the filters.
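The arithmetic behind the two indicators can be sketched in a few lines of Python. This is only an illustration and not Disco's implementation: the data model (a dictionary that maps a case ID to its list of events) and the function name coverage are assumptions made for this example.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, int]        # (activity name, timestamp in days); hypothetical model
Log = Dict[str, List[Event]]   # case ID -> ordered list of events


def coverage(original: Log, filtered: Log) -> Tuple[float, float]:
    """Return (% cases, % events) of `filtered` relative to `original`."""
    total_cases = len(original)
    total_events = sum(len(events) for events in original.values())
    kept_cases = len(filtered)
    kept_events = sum(len(events) for events in filtered.values())
    return (100.0 * kept_cases / total_cases,
            100.0 * kept_events / total_events)


original = {
    "c1": [("Request", 0), ("Check", 1), ("Pay", 3)],
    "c2": [("Request", 0), ("Check", 2)],
    "c3": [("Request", 1), ("Check", 2), ("Pay", 9), ("Inspect", 12)],
    "c4": [("Request", 2)],
}

# Example filter: keep only the cases that contain a "Pay" event.
filtered = {case_id: events for case_id, events in original.items()
            if any(activity == "Pay" for activity, _ in events)}

pct_cases, pct_events = coverage(original, filtered)
print(f"{pct_cases:.0f}% cases, {pct_events:.0f}% events")
```

In this toy log, two of the four cases and seven of the ten events remain, so the two percentages differ. This is why it can be useful to look at both indicators.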

If you click on the Filter symbol again after applying the filter, then you can see all the filters that are currently used and their filter settings (see also Adding Filters and Managing the Filter Stack). To change the filter settings, simply make the changes that you want and press Apply filter again.

To just see a quick overview of which filters are currently applied, you can click on either the % Cases or the % Events indicator for a Filter Summary.

Copy and filter (2)

If you press the Copy and filter button, then your filter settings are applied to a newly created copy of your data set and the original data set remains unchanged. Using the Copy and filter button is recommended for most situations, because this way you can preserve your analysis results and explore your process in many different dimensions without losing any of your previous work.

When you press the Copy and filter button, a small dialog comes up as shown in Figure 9. You can provide a meaningful name that reflects the reason why you have applied the filter - see (1) in Figure 9. Refer to Renaming Projects and Data Sets to see how you can change the name later on.

../_images/CopyFilter.png

Figure 9: Pressing Copy and filter applies the current filter settings to a copy of your data set (and leaves the original one unchanged).

Once you press the Create button, the filters will be applied to the new data set and you will find a new entry in the drop-down list at the top of your analysis screen (see Figure 10). This allows you to easily switch back and forth between your different analysis results anytime.

../_images/CopyFilterResult.png

Figure 10: By using the Copy and filter button you will create a “bookmark” of your current analysis view as a separate data set entry in your project.

You don’t need to worry that creating too many copies will overload your Disco project. Disco only stores the difference to the original file (the filters that you have applied) and does not duplicate the underlying data set. So, you can safely use the Copy and filter option also for very large data sets.
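To picture why such copies are cheap, consider the following minimal Python sketch. It is a conceptual illustration only and says nothing about Disco's internal storage format: the DataSet class, the contains helper, and the dictionary-based log are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Log = Dict[str, List[str]]       # case ID -> activity names (hypothetical model)
Filter = Callable[[Log], Log]


def contains(activity: str) -> Filter:
    """Build a filter that keeps cases containing `activity`."""
    return lambda log: {cid: acts for cid, acts in log.items() if activity in acts}


@dataclass
class DataSet:
    name: str
    source: Log                                   # shared reference, never duplicated
    filters: List[Filter] = field(default_factory=list)

    def copy_and_filter(self, name: str, filters: List[Filter]) -> "DataSet":
        """Create a new entry that stores only its name and its filter list."""
        return DataSet(name, self.source, list(filters))

    def view(self) -> Log:
        """Compute the filtered result on demand from the shared source."""
        log = self.source
        for f in self.filters:
            log = f(log)
        return log


original = DataSet("Purchasing", {
    "c1": ["Create", "Approve", "Pay"],
    "c2": ["Create", "Approve"],
    "c3": ["Create", "Pay"],
})
paid = original.copy_and_filter("Paid cases only", [contains("Pay")])

print(paid.source is original.source)   # the underlying data is shared
print(sorted(paid.view()))              # only the filtered cases are visible
print(len(original.view()))             # the original entry is unchanged
```

The copy holds a reference to the same underlying log plus a list of filters, so creating many copies adds very little data in this model.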

Normally, you want to preserve the applied filters (so that you can go back later to inspect or to change them). However, you can also choose to create a new data set that consolidates the applied filters in a permanent way. For example, when you have used the filters for cleaning your data you often want to create a new, “clean” copy as the reference data set for your analysis. To do this, you can select the Apply filters permanently option - see (2) in Figure 9. The filter outcome will be the same, but you will not see the % Cases and % Events indicators (the % indicators have been reset to 100% for the filtered data set) and you will not be able to change the filters anymore. However, as with using Copy and filter without the option Apply filters permanently, the original data set is still there until you delete it explicitly from your project. The consolidation only happens on the new copy you are creating.

Cancel (3)

Pressing the Cancel button simply brings you back to the previous state of the data set. It has the same effect as pressing first Undo changes and then Apply filter.

It is recommended to use the Cancel button if you are just inspecting some details about the filters that you have currently applied to make sure that you have not accidentally changed some of the settings without noticing.
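The equivalence between Cancel and pressing Undo changes followed by Apply filter can be modeled with a snapshot that is taken when the dialog opens. The following Python sketch is a simplified model under that assumption; the FilterDialog class and the dictionary-based filter settings are hypothetical and only serve to illustrate the behavior described above.

```python
import copy
from typing import Any, Dict, List

FilterSettings = Dict[str, Any]   # hypothetical: configuration of one filter


class FilterDialog:
    def __init__(self, applied: List[FilterSettings]) -> None:
        # Snapshot of the filter list as it was when the dialog was opened.
        self._on_entry = copy.deepcopy(applied)
        # Working copy that the user edits in the dialog.
        self.filters = copy.deepcopy(applied)

    def undo_changes(self) -> None:
        """Restore the working copy to the state on entry."""
        self.filters = copy.deepcopy(self._on_entry)

    def apply_filter(self) -> List[FilterSettings]:
        """Return the working copy as the new applied filter list."""
        return copy.deepcopy(self.filters)

    def cancel(self) -> List[FilterSettings]:
        """Discard all edits and return the state on entry."""
        return copy.deepcopy(self._on_entry)


applied = [{"type": "Attribute Filter", "mode": "Mandatory", "values": ["Pay"]}]

dialog = FilterDialog(applied)
dialog.filters[0]["values"].append("Inspect")          # accidental change
dialog.filters.append({"type": "Performance Filter"})  # filter added by mistake
via_cancel = dialog.cancel()

dialog.undo_changes()
via_undo_then_apply = dialog.apply_filter()

print(via_cancel == via_undo_then_apply == applied)
```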

If you apply a filter, or a combination of filters, it can happen that not a single case is left in the remaining data set that fulfills your filter conditions. In this situation, you get the empty filter result screen shown in Figure 11.

../_images/EmptyFilter.png

Figure 11: The empty filter result screen in Disco.

This can be a good thing. For example, if you are an auditor and you are checking for violations of a segregation of duties constraint in your process [2], then this screen means that no such violation actually occurred. You can press the Save report… button to document this result as an audit report (see also Exporting Audit Reports).

If, however, this is not what you have expected, then you have made a mistake in your filter configuration. You can press OK to go back to your filter settings and inspect them to see what went wrong. If you can’t find the problem, refer to “I am getting an empty log: What should I do?” for a trouble-shooting guide.

Combining Filters

When you have multiple filters in your filter stack, then each of the filters is applied to the original data set one after the other from top to bottom when you press the Apply filter or Copy and filter button (see Figure 12).

../_images/ApplyingFilters.png

Figure 12: Multiple filters are applied from top to bottom.

In many situations, the order does not make a difference because the filters address orthogonal dimensions of your process. For example, if your first filter focuses on all purchase orders from the Netherlands using the Attribute Filter and then a second filter removes all cases that are still in progress with the Endpoints Filter, then the order does not matter. You will get the same result if you apply these two filters the other way around.
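The purchase order example can be reproduced with a small Python sketch. It is a simplified model rather than Disco's actual filter logic: the case structure, the two filter functions, and apply_stack are hypothetical names introduced here to show the top-to-bottom application.

```python
from functools import reduce
from typing import Any, Callable, Dict, List, Set

Case = Dict[str, Any]            # hypothetical: case attributes plus activity list
Log = Dict[str, Case]
Filter = Callable[[Log], Log]


def attribute_filter(key: str, value: str) -> Filter:
    """Keep cases whose case attribute `key` equals `value`."""
    return lambda log: {cid: c for cid, c in log.items() if c[key] == value}


def endpoints_filter(end_activities: Set[str]) -> Filter:
    """Keep cases whose last activity is one of `end_activities`."""
    return lambda log: {cid: c for cid, c in log.items()
                        if c["activities"][-1] in end_activities}


def apply_stack(log: Log, stack: List[Filter]) -> Log:
    """Apply the filters top to bottom; each output is the next filter's input."""
    return reduce(lambda current, f: f(current), stack, log)


log = {
    "po1": {"country": "NL", "activities": ["Create", "Approve", "Pay"]},
    "po2": {"country": "NL", "activities": ["Create", "Approve"]},
    "po3": {"country": "DE", "activities": ["Create", "Approve", "Pay"]},
}
from_netherlands = attribute_filter("country", "NL")
completed = endpoints_filter({"Pay"})

result_a = apply_stack(log, [from_netherlands, completed])
result_b = apply_stack(log, [completed, from_netherlands])
print(result_a == result_b, sorted(result_a))
```

Both filters only decide whether a whole case stays or goes, and neither changes what the other one looks at, so swapping them gives the same result.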

However, there can be situations, where the order does matter. Let’s take a look at the different filters that are available in Disco in the table below.

Filter              Removes Cases             Removes Events
------------------  ------------------------  -----------------
Timeframe Filter    Contained in timeframe    Trim to timeframe
                    Intersecting timeframe
                    Started in timeframe
                    Completed in timeframe
Variation Filter    Yes                       No
Performance Filter  Yes                       No
Endpoints Filter    Discard cases             Trim first
                                              Trim longest
Attribute Filter    Mandatory                 Keep selected
                    Forbidden
Follower Filter     Yes                       No

As you can see, some filters only remove whole cases (for example, the Performance Filter and the Follower Filter) while other filters can also remove events from a case (depending on the filter mode). For the filters that can remove both cases and events (the Timeframe Filter, Endpoints Filter, and Attribute Filter), the table lists the filter modes that remove cases vs. events, respectively.

Note

When a filter that removes events removes all events in a case, the whole case will be removed from the data set.
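As a minimal Python sketch of this rule, assume a simplified model of an event-removing filter in the style of the Attribute Filter's Keep selected mode. The function keep_selected and the dictionary-based log are hypothetical and do not describe Disco's internals.

```python
from typing import Dict, List, Set

Log = Dict[str, List[str]]   # case ID -> activity names (hypothetical model)


def keep_selected(log: Log, selected: Set[str]) -> Log:
    """Keep only events whose activity is selected; drop cases left without events."""
    filtered: Log = {}
    for case_id, activities in log.items():
        remaining = [a for a in activities if a in selected]
        if remaining:                    # a case with no events left disappears
            filtered[case_id] = remaining
    return filtered


log = {
    "c1": ["Request", "Check", "Pay"],
    "c2": ["Request", "Reject"],
    "c3": ["Inspect"],
}
result = keep_selected(log, {"Request", "Pay"})
print(result)
```

Case c3 contains none of the selected activities, so all of its events are removed and the case itself no longer appears in the result.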

For most of the filters that remove whole cases, the order does not matter when you combine them. However, as soon as you include filters that remove events, the order can make a difference.

For example, imagine that you have a customer service refund process that goes through multiple phases. First, the request of the customer is being checked and, if approved, the customer gets their money back. Afterwards, the broken device is being inspected by the technical engineers and documented for future quality control. If you want to analyze the process from a customer perspective, you can trim the process to the part that involves the customer by cutting everything off that happens after the payment activity with the Endpoints Filter.

If you then want to focus on cases, where the time from request to payment took longer than 14 days (using the Performance Filter), then the order really matters. You need to apply the Endpoints Filter first and the Performance Filter afterwards. Otherwise, the Performance Filter selects cases based on the case duration for the whole process before the Endpoints Filter cuts the process to the customer-related process segment afterwards. Refer to Adding Filters and Managing the Filter Stack to see how you can change the order of your filters.
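The effect of the order can be made concrete with a small Python sketch of the refund example. The functions trim_after and slower_than are simplified, hypothetical stand-ins for the Endpoints Filter and the Performance Filter, and the timestamps are invented day numbers.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, int]        # (activity name, day); hypothetical model
Log = Dict[str, List[Event]]


def trim_after(log: Log, activity: str) -> Log:
    """Cut every case after the first occurrence of `activity`.

    Simplification for this sketch: cases without the activity are dropped.
    """
    trimmed: Log = {}
    for case_id, events in log.items():
        names = [name for name, _ in events]
        if activity in names:
            trimmed[case_id] = events[: names.index(activity) + 1]
    return trimmed


def slower_than(log: Log, days: int) -> Log:
    """Keep cases whose duration (last minus first timestamp) exceeds `days`."""
    return {case_id: events for case_id, events in log.items()
            if events[-1][1] - events[0][1] > days}


log = {
    "c1": [("Request", 0), ("Check", 2), ("Pay", 5), ("Inspect", 30)],
    "c2": [("Request", 0), ("Check", 10), ("Pay", 20), ("Inspect", 22)],
    "c3": [("Request", 0), ("Check", 1), ("Pay", 3)],
}

# Endpoints Filter first, Performance Filter second (the intended order).
trim_then_performance = slower_than(trim_after(log, "Pay"), 14)

# Performance Filter first, Endpoints Filter second.
performance_then_trim = trim_after(slower_than(log, 14), "Pay")

print(sorted(trim_then_performance))   # cases slow from request to payment
print(sorted(performance_then_trim))   # cases slow over the whole process
```

Case c1 was paid after 5 days but inspected on day 30. It is only selected when the Performance Filter runs on the untrimmed data, which is not what the customer-perspective analysis asks for.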

It’s not always easy to understand how multiple filters influence each other. So, if your filter combination does not give you the expected result (or gives you an empty result), it is best to investigate the effect of each individual filter in the stack as described in the “I am getting an empty log: What should I do?” trouble-shooting guide.

Configuring Filters Based on the Outcome of Previous Filters

All filters in the filter stack refer to the original log when you configure them (see also Adding Filters and Managing the Filter Stack), which makes it possible for Disco to efficiently filter also very large data sets. Keep in mind that in process mining the filtering and analysis is always performed on the raw data — not based on some aggregated and pre-calculated data structures like in BI tools.

To make your process mining work as explorable, re-usable and efficient as possible, you can quickly change filter settings, combine, move and remove filters, and copy data sets with existing filter stacks as much as you want. Then, only once you apply the filters (see Applying Filters), the actual filtering takes place. Now each filter in the filter stack is applied one after the other. So, the output of the first filter forms the input for the second filter, and so on. See Figure 13 for an illustration of how the filter stack works in Disco.

../_images/Troubleshooting-1.png

Figure 13: How the filter stack works: Filters can be quickly re-configured and moved around also for large data sets because they refer to the original data set. When the filters are applied, the output of the first filter is the input for the second filter and so on.

As shown in Figure 13, the configuration screens of all the filters in the filter list “see” the original data set. In most situations, you will combine filters that are orthogonal to each other by addressing different aspects of the process (timeframe, performance, removing certain activities, etc.), or you can picture the dependencies between the filters, where they exist. So, this is not a problem. However, sometimes you really want to configure a filter (“see” the configuration settings) based on the output produced by the filters before.

For example, let’s take the Purchasing process from the Sandbox (see First Steps after Installation: The Sandbox) again and let’s say you have added a Trim longest Endpoints Filter that focuses the analysis on the requisition and approval phase. You are now interested in the performance of this part of the process and look at the case durations (see Figure 14). We can see that by trimming the process to the requisition phase the average throughput times and the shape of the case duration histogram have changed compared to the full data set (take a look at the Case duration screenshot in Overview Statistics to see the difference).

../_images/Troubleshooting-2a-1.png

Figure 14: After applying the Trim longest Endpoints Filter to focus on the requisition and approval phase of the purchasing process, the case durations have changed.

Now imagine that we want to focus on the 20% slowest cases in the requisition phase. We can add a Performance Filter as a second filter to our data set. However, the Performance Filter settings still display the distribution of the underlying data set (i.e., of the original, full purchasing process). As a result, the % cases indicator at the bottom also shows the coverage estimate for the full process rather than for the requisition phase (see Figure 15).

../_images/Troubleshooting-2a-2.png

Figure 15: If the Performance Filter is simply added to the Endpoints Filter, then the Performance Filter settings do not show us the distribution and coverage feedback for the requisition phase but for the underlying data set.

However, what we would like to see instead is the Performance Filter settings based on the outcome of the Endpoints Filter before. You can resolve this by “consolidating” your data set in an intermediate step as follows:

  1. Add all filters that should be applied before the filter that should “see” the outcome of the previous filters (in our example, this would be all filters before the Performance Filter). Then press the Copy and filter button and select the Apply filters permanently option as shown in Step (1) in Figure 16.
You can give the new data set a name that reflects this intermediate filtering state, such as “Requisition Phase only (permanent)”.
  2. Now, you can add the filter that should “see” the outcome of the previous filters (in our example, this is the Performance Filter) to the new data set as shown in Step (2) in Figure 16. Because the previous filters have been applied permanently (and, therefore, the filter outcome has been re-set as the new reference data set), the configuration screen of the filter is now based on the output from the previous filter steps.
In our scenario, this means that the case duration histogram in the background of the Performance Filter now reflects the distribution of the case durations for the requisition phase of the process. Furthermore, focusing on the 20% slowest cases now gives us the 20% slowest cases in the requisition phase as expected.
../_images/Troubleshooting-2.png

Figure 16: If you want the configuration screen of a filter to “see” the output of the previous filters in your list, then you can achieve this by applying the previous filters permanently before you configure that filter.
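The difference between configuring against the original data set and against a consolidated one can be illustrated with a small Python sketch. It assumes a simplified model in which the Performance Filter is configured as a minimum duration that is read off a reference distribution. All function names and the toy data are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Event = Tuple[str, int]        # (activity name, day); hypothetical model
Log = Dict[str, List[Event]]


def duration(events: List[Event]) -> int:
    return events[-1][1] - events[0][1]


def trim_to(log: Log, end_activity: str) -> Log:
    """Cut every case after the first occurrence of `end_activity`."""
    trimmed: Log = {}
    for case_id, events in log.items():
        names = [name for name, _ in events]
        if end_activity in names:
            trimmed[case_id] = events[: names.index(end_activity) + 1]
    return trimmed


def slowest_threshold(reference: Log, percent: int) -> int:
    """Shortest duration among the slowest `percent` of cases in `reference`."""
    durations = sorted((duration(ev) for ev in reference.values()), reverse=True)
    count = max(1, math.ceil(percent * len(durations) / 100))
    return durations[count - 1]


def at_least(log: Log, days: int) -> Log:
    """Keep cases that took `days` or longer."""
    return {cid: ev for cid, ev in log.items() if duration(ev) >= days}


full = {
    "a": [("Request", 0), ("Approve", 2), ("Pay", 40)],
    "b": [("Request", 0), ("Approve", 10), ("Pay", 12)],
    "c": [("Request", 0), ("Approve", 3), ("Pay", 5)],
    "d": [("Request", 0), ("Approve", 1), ("Pay", 20)],
    "e": [("Request", 0), ("Approve", 4), ("Pay", 6)],
}
requisition = trim_to(full, "Approve")

# Threshold configured while "seeing" the original data set.
threshold_full = slowest_threshold(full, 20)
naive = at_least(requisition, threshold_full)

# Threshold configured on the consolidated (permanently filtered) data set.
threshold_requisition = slowest_threshold(requisition, 20)
consolidated = at_least(requisition, threshold_requisition)

print(threshold_full, sorted(naive))
print(threshold_requisition, sorted(consolidated))
```

In this toy data, the threshold derived from the full process is far too high for the trimmed cases and selects nothing, while the threshold derived from the consolidated data set selects the slowest case of the requisition phase.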

Other situations in which you might want to apply filters permanently are, for example, if you want to filter the variants using the Variation Filter or Attribute Filter after a combination of other filters, because the variants change when one or more filters are applied (they are always re-calculated on the fly based on the filtered data set). It is also recommended to apply your filters permanently after cleaning up your data set from erroneous timestamps or incomplete cases.

Filter Recommendations

When you open up the filter settings for a new data set, Disco gives you suggestions for filters that may be useful for your log, see (3) in Figure 3. The recommendations are specific to your data set. So, if you open the filter settings for another data set, you will most likely see different recommendations.

For example, in Figure 3 the filter recommendations for a refund service process are shown. Disco recommends the Variation Filter because there is a high variation in the behavior, which can lead to large and complex process maps (focusing on the dominant variants can help to simplify the process). Furthermore, the Endpoints Filter is suggested, because there are many different endpoints in the process (due to incomplete cases) and focusing on the expected endpoints will help to show the end-to-end process behavior. Finally, the Performance Filter is recommended, because the cases have highly varying throughput times (focusing either on fast or on long-running cases can help to understand the differences).

For another data set you might, for example, get the recommendation to use the Timeframe Filter because Disco has detected that some of the timestamps lie in the future, which usually hints at a data quality problem.

The recommended filters are meant as a starting point. Read the explanation that comes with the recommendation and decide whether you want to look at the filter. If you want to add one of the recommended filters press the Add filter button, or press Add all recommended filters to add all at once.

Note

When you add a recommended filter, it does not do anything yet! You still have to configure the filter to take effect. The filter recommendations are just a hint that is given by Disco to help you get started cleaning and analysing your data.

However, there is no guarantee that Disco will recommend all filters that would be useful to you. This is not possible because a lot of the relevant knowledge about a process is only available to a domain expert. Furthermore, the filters serve as a way to translate your specific process questions into answers in your process analysis. Limiting yourself to the recommended filters only would be a mistake and you would lose out on a lot of the analysis capabilities in your process mining project. See how you can choose from the complete set of filters in Adding Filters and Managing the Filter Stack.

Over time, you will build up experience in working with the filters and you will know which filter to use when. There are six different types of filters available in Disco: the Timeframe Filter, Variation Filter, Performance Filter, Endpoints Filter, Attribute Filter, and Follower Filter. Refer to the individual sections to learn more about how they work.

Filter Summary

If you want to inspect your current filter settings for a data set, you can always click on the filter symbol in the lower left corner to see the full filter stack as shown in Figure 5. However, sometimes you just need a little reminder of what you are looking at exactly. This is where the filter summary comes in handy.

You can see a filter summary for your current data set when you click on either the % Cases or the % Events pie-chart indicator next to the filter symbol in the lower left corner (see Figure 17).

../_images/Filter-Summary.png

Figure 17: The filter summary in Disco.

A small window pops up and gives you a condensed, human-readable overview about which filters have been applied. If this is all you needed to know, you can simply click somewhere outside of the window to make it disappear. If you want to take a closer look, you can click on the Configure filters button at the bottom of the filter summary to get to the filter settings.

You can bring up the filter summary both from the project view as well as from any of the three analysis views (Map, Statistics, and Cases).

Recipes: Saving, Sharing, and Re-using Filter Combinations

Because filters are essential for cleaning your data and for answering your analysis questions, it is important that you can store, document, and re-use your filter settings.

Imagine, for example, that you have done an analysis and then a process change has been made to address the discovered problems. Now you have received the new data set and you want to analyze whether the process change was as effective as you had hoped. Of course, you could manually re-create all the filters from your first analysis, but this is time-consuming and you might accidentally introduce mistakes. It is better to use the Recipe functionality in Disco to re-create the exact same analysis on your new data set.

In Disco, a recipe is a combination of filters (including their order and all the filter settings). The recipes can be accessed from the filter settings through the chef’s hat symbol in the lower left corner. Once you click on the chef’s hat the Recipe window appears as shown in Figure 18.

../_images/Recipe-1.png

Figure 18: The Recipes allow you to save, share, and re-use filter combinations in your analysis.

The Recipe window contains the following main controls:

Recipe Categories (1)

You can explore your recipes in five different categories: Current, Project, History, Favorites, and Matches. Click on the corresponding tab at the top to change between them:

  • Current: In the Current tab of the Recipe window you can see a summary of the filters that are in the filter stack of your data set at the moment (shown in Figure 18). From here, you can export (see Exporting Filter Settings) or save them as a bookmark (see Saving Your Filter Settings).
  • Project: In the Project tab, you see all the filter combinations that have been applied to other data sets in your workspace. You can use them to re-apply filters that you have previously used in another data set (see also Re-using Filter Settings From Other Data Sets in Your Project).
  • History: The History tab shows you the last 20 filter combinations that you have applied in Disco. This can be handy if you just want to look up a previously used filter combination, or if you have accidentally deleted or changed the filter settings of a data set (see also Using the Recipe History to Rescue Deleted or Changed Filters).
  • Favorites: The Favorites tab contains all the filter combinations that you have “bookmarked” for later use (see Saving Your Filter Settings). To use one of the saved filters on a new data set, simply select it and press the Apply button.
  • Matches: Each recipe in Disco features a star rating between 0 and 5 stars. The star rating indicates how applicable the recipe is for your current data set. The Matches tab shows you all recipes from the Project, History, and Favorites tabs combined that have three or more stars.
Favorite button (2)
To bookmark and store a filter combination on your computer, you can use the Favorite button. Simply click on the Favorite symbol and the currently selected recipe is added to your list of Favorites. This way, you can access it later to re-use it at any time. Refer to Saving Your Filter Settings to learn how to keep bookmarks of your filter settings using the Favorite recipes.
Export button (3)
To save and store a filter combination outside of Disco (or share it with a colleague), you can click on the Export button (see also Exporting Filter Settings). This opens a file dialog window that lets you choose the location to which the .recipe file should be saved. The .recipe file is an XML-based, computer-readable file that contains all the details of your filter settings. It can be imported into Disco on your computer, or on any other computer, using the Load button - see (5) in Figure 18.
Apply button (4)
To apply a selected recipe on your current data set, simply press the Apply button. The filters will appear in your filter stack and you can check whether the filter settings are still correct for your new data set. For example, there might be additional activities contained in the new data that are not yet included in your filter configuration but should be. After you have verified the filter settings, you can press the Apply filter or Copy and filter button to filter the data set based on this recipe.
Load button (5)
The Load button lets you import a .recipe file that has been exported previously (see also Exporting Filter Settings). This can either be a recipe that you have saved yourself, or one that you have received from a colleague.
Exit button (6)
To close the recipe window, you can click on the little x button in the upper right corner. Alternatively, you can click anywhere outside of the recipe window to make it disappear.

Now let’s take a look at some typical recipe usage scenarios.

Saving Your Filter Settings

One of the common scenarios is that you want to explicitly save a filter combination for later re-use. You can go to the Current tab in the Recipe window to see a summary of the filters that are in the filter stack of your data set at the moment (see Figure 19).

../_images/Recipe-1b.png

Figure 19: The Current tab shows you a filter summary for your current data set. You can then export or save the filter combination as a Favorite.

To save this filter combination, you could use the Export button to store a .recipe file and Load it later (see also Exporting Filter Settings). However, if you want to save the filter combination just for yourself, it will be much quicker to simply use the Favorite button.

You can get to all your saved recipes by changing to the Favorites tab (see Figure 20). There, you can also click on the name of the recipe and change it. This helps you to give the saved recipe a meaningful name (the default is the current name of your data set). You can press the Unfavorite symbol to undo the bookmarking once you no longer need a particular recipe.

../_images/Recipe-2.png

Figure 20: The Favorites tab shows you all the recipes that you have saved as a favorite (works across different projects).

Your favorite recipes are stored in your Disco settings. This means that they persist across multiple analysis sessions and across projects. However, if you use Disco on multiple computers, be aware that each installation of Disco keeps its own favorites. To synchronise them, or to move recipes between computers, you can always use the Export and Load functionality (see Exporting Filter Settings).

Re-using Filter Settings From Other Data Sets in Your Project

Another common scenario is that you have received a fresh data set and want to repeat an analysis that you had done earlier. For example, imagine that based on your initial analysis a process change was made and now you want to check whether the change was as effective as predicted. Or you have received a new data extract that now hopefully fixes the data quality issues that you had discovered in the previous version. Either way, you would like to repeat certain analyses that you have already performed on an older version of this data set.

When you import the data set and go to the recipes, then the list of filters for the current data set is still empty (see Figure 21).

../_images/Recipe-3a.png

Figure 21: The Current tab is empty if you have not yet added any filters to your data set.

You could then choose a saved filter combination from your Favorites tab and apply it. However, you don’t even need to have saved a recipe as a favorite. Most likely, you have the filtered data sets from your previous analyses still in your workspace, so you can simply re-use these filter combinations from the other data sets in your project. To do this, you can change to the Project tab as shown in Figure 22.

../_images/Recipe-3b.png

Figure 22: The Project tab shows you all data sets in your current project that have one or more filters applied. Simply choose the recipe that you want to re-use and apply it to your new data set.

To help you to quickly find the data set that you are looking for, Disco uses a 0 to 5 star rating that shows you how much the recipes in your project match your current data set. For example, if a data set has been filtered based on activities that don’t even exist in your current data set, then Disco will display a low star rating for this recipe. This is particularly useful if you have been working with different data sets and lets you focus on the recipes that are most relevant to your new data set.
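
The idea behind the star rating can be illustrated with a small sketch. Disco's actual scoring is not described in this chapter, so the formula below (the share of activities referenced by a recipe that also occur in the current data set, scaled to 0 to 5 stars) is purely an assumption for illustration, as are all function names, recipe names, and activity names.

```python
from typing import Dict, List, Set


def star_rating(recipe_activities: Set[str], data_set_activities: Set[str]) -> int:
    """Hypothetical rating: share of the recipe's activities found in the data set, as 0-5 stars."""
    if not recipe_activities:
        # Assumption: a recipe that references no activities cannot mismatch.
        return 5
    matched = len(recipe_activities & data_set_activities)
    # Integer division rounds down, e.g. 3 of 4 activities gives 3 stars.
    return (5 * matched) // len(recipe_activities)


def matches_tab(recipes: Dict[str, Set[str]], data_set_activities: Set[str]) -> List[str]:
    """Names of all recipes with three or more stars, mirroring the Matches tab."""
    return [
        name
        for name, activities in recipes.items()
        if star_rating(activities, data_set_activities) >= 3
    ]


# Made-up example data: activities in the new data set and two earlier recipes.
current_activities = {"Create order", "Approve order", "Ship goods", "Send invoice"}
recipes = {
    "Order approval check": {"Create order", "Approve order", "Ship goods", "Pay invoice"},
    "Call center follow-up": {"Inbound call", "Handle email"},
}

for name, activities in recipes.items():
    print(name, star_rating(activities, current_activities))
print(matches_tab(recipes, current_activities))
```

In this sketch, the recipe that filters on call center activities gets zero stars for an order data set, so it would not show up among the matches.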

Once you have selected the recipe that you want to re-use, simply press the Apply button to add the filter combination to your current data set. After you have applied a recipe, you can still inspect the filters to make sure they are configured the way you want them (see Figure 23). Make sure to check the filter settings, because your new data set might be different and, for example, contain additional activities that were not present in the previous version.

../_images/Recipe-3c.png

Figure 23: After adding the filters to your data set, make sure to check the filter settings to ensure they are right for your new data set. You can then either apply them directly (Apply filter) or store the filter result as a new data set (Copy and filter).

Using the Recipe History to Rescue Deleted or Changed Filters

Next to re-using recipes from your Favorites tab or from the Project tab, you can also see a list of the 20 last applied filter combinations in the History tab (see Figure 24).

../_images/Recipe-4.png

Figure 24: The History tab shows you the 20 last applied filter combinations.

The recipes shown in the History tab (as well as in the Favorites tab) also use the star rating to help you quickly find the most relevant filter combinations for your data set.

The History recipes are particularly useful if you have inadvertently deleted a data set, or accidentally changed the filter settings, and now want to restore it. Simply find the filter combination that you have lost and apply it to your current data set by pressing the Apply button.

“I am getting an empty log: What should I do?”

Sometimes, an empty log (see Figure 11) just tells you that the answer to the question you asked is “no”. For example, if you create a filter to find all cases that have been completed this month and that took longer than 30 days (to follow up with the customers and give them a present to compensate for their slow service), and the result is empty, then this simply means that there are no such cases in your data set. Especially if you are testing your data set for the violation of a compliance rule (like a segregation of duties, or 4-eyes principle, constraint [2]) this is actually a good result, because your process has not violated that rule.

However, sometimes you may be getting an empty log and you don’t know what went wrong. You expect a different result and you know that you have made a mistake in the configuration of your filters.

To find the mistake, we recommend tracking down the problem by going through your filter stack step by step in the following way:

  1. Save the filter combination that gives you the unexpected “No cases passed your filter” result as a Favorite recipe (see Saving Your Filter Settings). You can call it “Test filters” or something similar to remember why you are saving it.
  2. Then, make a copy of your current data set for testing the problem (see Copying Data Sets if you don’t know yet how to copy a data set). Again, you can give the copy a name like “Test”, so that you know you can remove it later on.
  3. On the test data set, apply the filter combination from your saved Favorite recipe and remove all but one filter to test it in isolation (just keep the one you are testing). Apply the filter and check whether it gives you the expected result. Take a look at the different data set views (Process map, Statistics, Cases view) to inspect the effect this filter has had on your data set. If everything looks good, go back to your filter settings, apply the saved Favorite recipe again and test the second filter in isolation. By going through the individual filters in your stack in this way, one by one, you will likely find the problem already in this step.
  4. If each of the filters in isolation works just as expected, it’s time to look at the combination of your filters: Start by adding just the first and the second filter from your filter list. Does this combination give you the expected result so far? If yes, add and apply the third filter and check the resulting log in the different data set views again. Continue until you find the filter that is not giving you the expected output.
  5. If, after you have found the problematic filter, you still can’t see why it does not give you the result you expected, make a copy of your test data set by applying all filters before the problematic filter using Copy and filter and ticking the Apply filters permanently option (see also Configuring Filters Based on the Outcome of Previous Filters). Give the copy a name like “Testing filter X” to remember what you created the copy for. Now add just your problematic filter to the new data set “Testing filter X”. Inspect the filter settings. Is anything different in the filter settings after you have applied the previous filters in a permanent manner? Refer to the manual for the corresponding filter (see Filtering) if you are not sure what it does.
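
The logic behind steps 3 and 4 can be sketched in a few lines of Python. This is not how Disco works internally and not a Disco API: the filters are modelled as plain functions on a list of cases, all names are hypothetical, and "unexpected result" is simplified to "empty log".

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Case = Dict[str, Any]
Filter = Callable[[List[Case]], List[Case]]


def find_emptying_filter(cases: List[Case], filters: List[Filter]) -> Optional[Tuple[str, int]]:
    """Return the phase and position of the first filter that empties the log, or None."""
    # Step 3: test every filter in isolation on the unfiltered copy.
    for index, apply_filter in enumerate(filters):
        if not apply_filter(cases):
            return ("isolation", index)
    # Step 4: add the filters one by one, in the order of the filter stack.
    remaining = cases
    for index, apply_filter in enumerate(filters):
        remaining = apply_filter(remaining)
        if not remaining:
            return ("combination", index)
    return None


def completed_in_june(cases: List[Case]) -> List[Case]:
    return [c for c in cases if c["completed"] == "2024-06"]


def slower_than_30_days(cases: List[Case]) -> List[Case]:
    return [c for c in cases if c["duration_days"] > 30]


# Made-up test data: one slow case from May, one fast case from June.
test_cases = [
    {"id": "A", "completed": "2024-05", "duration_days": 45},
    {"id": "B", "completed": "2024-06", "duration_days": 10},
]

print(find_emptying_filter(test_cases, [completed_in_june, slower_than_30_days]))
```

Here each filter keeps one case when tested in isolation, and the log only becomes empty once the second filter (position 1) is added to the first. This corresponds to the situation in step 4, where the problem lies in the combination rather than in a single filter.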

“Why is the preview different from the % Cases after applying the filter?”

There are two reasons why the coverage indicator in the Performance Filter (or the Variation Filter) settings might give you a different result than the % Cases that you see after you have applied the filter:

  1. The coverage indicator in the filter settings shows you an estimate of the percentage of cases that are covered by your selection. The goal is to help you assess how many cases you are about to filter. However, the precise percentage can only be determined by actually filtering your data set (keep in mind that during the configuration of your filter settings no actual filtering takes place yet - see also Adding Filters and Managing the Filter Stack and Applying Filters).

In most situations, the estimate is pretty accurate. Disco splits up your data set into buckets, and creates the estimate based on the blue buckets that are covered by your selection. However, sometimes you have a data set with a distribution where small changes in the selection boundaries have a big impact on the case coverage of the selection.

For example, take a look at the call center example in Figure 25. The shape of the case duration histogram shows that there are many cases that are very fast (running just up to 12 hours) and then there is a long tail of slower cases.

../_images/Troubleshooting-Percentage-1.png

Figure 25: The coverage indicator estimates the % Cases based on the blue bars covered by your selection.

Now, when we change the lower boundary of the selection to all cases between 1 hour and 12 hours, then the estimate of the coverage indicator does not change (73%), because the estimate is still based on the same, very high blue bar in the histogram (see Figure 26).

../_images/Troubleshooting-Percentage-2.png

Figure 26: When you change the lower or upper boundary of the filter, but don’t trigger a different selection of bars in the histogram, then the estimate stays the same.

However, once we apply the filter we can see that only 3% of the cases took between 1 hour and 12 hours (see Figure 27).

../_images/Troubleshooting-Percentage-3.png

Figure 27: The true coverage can only be seen after filtering (here 3% of the cases took between 1 hour and 12 hours).

The reason is that 70% of the cases actually took less than 1 hour (see Figure 28). The estimate for both filters (less than 1 hour and between 1 and 12 hours) was 73%, because it was based on the full bucket (between 0 and 12 hours) in the histogram.

../_images/Troubleshooting-Percentage-4.png

Figure 28: The true coverage can only be seen after filtering (here 70% of the cases took less than 1 hour).

This example shows that if you place your Performance Filter boundaries not at the border but in the middle of a large bucket in the reference chart, there can be some discrepancies between the estimate and the actual filter coverage. If you should get two conflicting numbers, then always refer to the % Cases indicator that is shown after filtering.

  2. A second reason for a difference in percentages can be that you have other filters in your filter stack before the Performance Filter. Refer to Configuring Filters Based on the Outcome of Previous Filters to learn how you can align your Performance Filter settings view with the outcome of previous filters.
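
The bucket effect described under the first reason can be reproduced with a small sketch. It assumes that the estimate counts every case in any bucket that the selection touches. The exact rule Disco uses is not stated here, and the 12-hour bucket width, the function names, and the data are made up to mirror the numbers of the call center example.

```python
from typing import List


def estimated_coverage(durations: List[float], bucket_edges: List[float],
                       lower: float, upper: float) -> float:
    """Share of cases in all buckets that overlap the selection [lower, upper)."""
    covered = 0
    for left, right in zip(bucket_edges, bucket_edges[1:]):
        if left < upper and right > lower:  # the bucket is touched by the selection
            covered += sum(1 for d in durations if left <= d < right)
    return covered / len(durations)


def actual_coverage(durations: List[float], lower: float, upper: float) -> float:
    """Share of cases whose duration really lies in [lower, upper)."""
    return sum(1 for d in durations if lower <= d < upper) / len(durations)


# Made-up case durations in hours: 70 very fast cases, 3 medium, 27 in the long tail.
durations_hours = [0.5] * 70 + [6.0] * 3 + [30.0] * 27
bucket_edges = [0.0, 12.0, 24.0, 36.0]

for lower, upper in [(1.0, 12.0), (0.0, 1.0)]:
    estimate = estimated_coverage(durations_hours, bucket_edges, lower, upper)
    actual = actual_coverage(durations_hours, lower, upper)
    print(f"{lower}-{upper} hours: estimate {estimate:.0%}, actual {actual:.0%}")
```

With this data, both selections fall into the same first bucket and therefore get the same estimate of 73%, while the actual coverage is 3% for 1 to 12 hours and 70% for less than 1 hour.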

Footnotes

[1] This has the same effect as if you pressed the Cancel button in the lower left corner and afterwards entered the filter settings again.
[2] Learn more about how to check segregation of duty constraints with Disco in the following article: http://fluxicon.com/blog/2014/03/how-to-check-segregation-of-duties-with-disco/.