Scenario #1: Unintentionally duplicated files
Your development team has a library of standard files that are intended to be
used throughout the numerous projects they will be working on. These are
typically global libraries that address needs such as standard data access,
network access, global variables, etc. These files are shared in tens, if
not hundreds, of projects.
A team member, while creating a new project, gets the latest copy of
the library files he will need to support the project. He does not create
the project in SourceSafe and then share in the needed files. Instead, he
creates the project outside SourceSafe, gets the files he needs from SourceSafe
and then checks the entire new project in to SourceSafe.
What just happened? Well, SourceSafe has no way of knowing that the files
just added are copies of already existing files, so SourceSafe assumes
they are new files. SourceSafe now contains an exact copy of a version of the
original file. Or, even worse, modifications were made to the file before the
project was checked in, casting doubt on the true origin of the code.
Why is this bad? Because any changes made to the original file will not
be propagated to this copy and vice versa. This project will never
benefit from the sharing of files because these files are not shared.
Any bugs that are corrected in the original file will still exist in the
duplicated file. Any enhancements made to the original file will not
propagate to the duplicated file. Yet from the user's view of SourceSafe,
it can be very difficult to identify that the duplicated file is not the
same as the original file.
This condition may go unnoticed (and typically does) for some time. As new
projects are added (using the proper way by sharing-in files), an
unsuspecting user may be sharing the duplicated file! The problem
is getting worse and the potential maintenance costs are rising rapidly.
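To make the duplicate problem concrete, here is a minimal Python sketch of how byte-identical, same-named copies could be spotted among local working-folder copies of several projects. It does not talk to SourceSafe at all; the function name `find_identical_copies` and the idea of hashing local copies are assumptions made purely for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_identical_copies(root):
    """Group files under root that share both a name and identical content.

    Hypothetical helper: it inspects local working-folder copies only and
    knows nothing about SourceSafe share links.
    """
    groups = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            # File names are compared case-insensitively (an assumption).
            groups[(path.name.lower(), digest)].append(path)
    # Keep only name/content pairs that occur in more than one place.
    return {key: sorted(paths) for key, paths in groups.items() if len(paths) > 1}
```

Identical copies found this way may be properly shared files or unintended duplicates. Only the share information held in SourceSafe can tell the two apart, so treat the result as a list of candidates to investigate.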
Scenario #2: Unintentionally branched files
A user is maintaining a project over time. One particular task requires a
change to a shared file, so the user makes the
needed changes and checks them into SourceSafe. Later on, due to QA
problems, specification changes, etc., the user (or another user) determines
that those changes need to be undone. So the project is opened up, the
shared file selected, and it is rolled back to the previous version.
What does SourceSafe do? SourceSafe properly issues a warning along
the lines of: "This action will cause branching of the shared file. Do
you wish to continue?" Now, because the user just doesn't
understand the impact, or because he feels he has no choice if he is to
restore the shared file to its former state, this user chooses to continue with
the rollback.
What just happened? Well, just as SourceSafe said, it branched that
shared file. So what? As you know, that means from the branched
version on, these two files are now unique, i.e. different files. Just as
in scenario #1 above, changes made to one file will not be automatically
propagated to the other file.
Can it get worse?
It sure can. As developers, we tend to hover around the things we
are familiar with. So if a user has a new project that needs to share a
library file, he tends to go to the last project he worked on (because it is
fresh in his mind) to find and share it. This time he does it right,
using all the best practices published by Microsoft and your internal
procedures. Unfortunately, the file being shared is a previously
unintentionally duplicated/branched file. And the madness continues to
grow!
AND, if you are using one of those third-party reporting tools to
determine who is doing what kind of work, then you have these inaccuracies
automatically built into the data:
-
Duplicated files appear as though the user that added them to the
project wrote the code. In fact, all he did was a get and an Add File.
There should be no credit for writing new code!
-
Each branched file reports all the changes made all the way back to
version 1. This means that if a file was unintentionally branched at
version 25, then, because SourceSafe (and the third-party reporting tools) see
them as two separate files, all changes from version 25 back to version 1
will be reported twice. If either file is branched again, the
entire history from the branch point back will be reported again.
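The double counting described above is simple arithmetic. The following Python sketch is only an illustration of that arithmetic, not how any reporting tool actually works; the function name `reported_changes` and the 40-version file used in the example are made up.

```python
def reported_changes(total_versions, branch_points):
    """Count the changes a per-file report would show for one original file.

    total_versions: number of versions in the original file's history.
    branch_points: version numbers at which unintended branches were made.
    Assumes each branch repeats the full history from its branch point back
    to version 1; changes made after branching are ignored.
    """
    return total_versions + sum(branch_points)


# A hypothetical file with 40 versions, unintentionally branched at version 25.
once = reported_changes(40, [25])
# The same file after a second unintended branch at version 25.
twice = reported_changes(40, [25, 25])
```

In the first case the report contains 65 changes for only 40 real ones. After the second branch it contains 90.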
What should we do? The manual approach.
Feeling lucky and having some time on your hands, you recall that there are
improperly duplicated files in your SourceSafe repository. You
decide to tackle the problem using SourceSafe. Using the features of
SourceSafe you:
-
Do a wildcard search for all projects that use a single, specific
suspected duplicated file name.
-
Wait three minutes for the results of the search (maybe more, maybe less
depending on your repository size and criteria)
-
Click the Date/Time header to sort the results by Date/Time, but it
doesn't sort (Is that a bug?).
-
So you manually scan the (potentially) hundreds of entries to determine the
timestamp of the correct version. (Most frequent timestamp wins!)
-
Then you manually scan the (potentially) hundreds of entries for differing
timestamps in order to identify files that need to have the sharing
corrected.
-
You find a project, but:
-
This project is not yours. (Make a note to self to notify the team lead
of that project.)
-
Wait, this file is pinned. Don't change it just yet because you don't know if
it is the correct file that is pinned or the duplicated/branched file that is
pinned. Nor do you know the reason it was pinned. (You hope it is
in the comments!)
-
Ah, here's an entry that is in one of your projects and is not pinned. Take
some action!
-
The icon suggests this file is not shared. Good. That makes it easier because
this file has the same name as one of your standard library files, so
it is very suspect.
-
You want to just go to the project, delete the bad file and then share the good
file, but being a team lead, you don't want to cause any more work than
needed. Therefore, you decide to compare the two files.
-
Click one file, ctrl-click the other file, right click to get context menu,
choose Show Differences
and, voila, you can see all the differences.
-
Well of course they are different. Time has lapsed and files have changed
since the offending action occurred.
-
But SourceSafe provides no easy method to compare all versions
of files for potential exact matches.
-
After some consideration of the potential impact of making the change, you opt
not to make the change.
-
So, what kind of tool do you need to be confident that any change you make will
not immediately break your project? (SSAnalyzer!)
-
You have just spent 5-10 minutes (again!), only to determine that you
dare not make these types of changes for fear of what you might break, yet
you know that leaving the problem alone invites potentially costly problems
in the future. Who can help? (SSAnalyzer!)
-
So, you just spent at least 10 minutes to determine that you dare not change
this one problem. Now, using our estimate of possibly 20% of your
SourceSafe repository suffering from similar problems and applying some quick
math, you determine that you could spend several man-weeks fixing these
problems (if only you could be certain of the fixes).
-
Well, with these numbers you decide to bite the bullet and wait for all those
horrid maintenance problems to pop up. Then you and your team will fix
them (they will have a priority!), and you can explain to several layers of
management why the same bug appeared again, or why that enhancement didn't make
it into all projects, and explain to your family why you must work late (again!).
Or you can find another solution. (SSAnalyzer!)
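The "quick math" in the walkthrough above can be sketched in a few lines of Python. The 20% problem rate and 10 minutes per file come from the text; the function name `manual_cleanup_hours`, the repository size of 5,000 files and the 40-hour work week are assumptions chosen only to show the calculation.

```python
def manual_cleanup_hours(total_files, problem_fraction=0.20, minutes_per_file=10):
    """Estimate the hours spent investigating suspect files by hand."""
    return total_files * problem_fraction * minutes_per_file / 60


hours = manual_cleanup_hours(5000)  # 5,000 files is a made-up repository size
weeks = hours / 40                  # assuming a 40-hour work week
```

With these assumed inputs the estimate comes to roughly 167 hours, a little over four man-weeks, and that is only the time spent deciding what to do.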
What should we do? SSAnalyzer!
One of the key problems with the manual approach to resolving these problems is
that it requires that a user at least suspect that improperly
duplicated/branched files exist and then spend inordinate amounts
of time and analysis to rectify the problems.
SSAnalyzer provides an automated means by which to identify, deeply
research and quickly and confidently correct the problems of unintentionally
duplicated and/or branched files.
Here's how:
-
SSAnalyzer creates a database of all the SourceSafe transactions in your
SourceSafe repository.
-
This means that the SQL tables contain all the adds, shares, check-ins,
deletes, purges, pins, etc., but none of the actual source code changes.
-
SSAnalyzer identifies all unique files that share the same name. These
files are considered possible problem areas.
-
These same-named
files are presented to the user, along with complete statistics and information
as to how many projects share them, how many are active (vs. deleted) and
specifically in which projects they are used.
-
Through SSAnalyzer, the user can:
-
Identify which unique file is the primary file (the correct file). There
could be more than one. The user has complete control.
-
Compare all versions of the primary file to the suspected bad file.
-
Identify the most recent version of the primary file that exactly
matches the latest
version of the bad file.
-
Flag the files (and projects) that should be changed with
either a Replace or Replace/Pin.
-
Compare any versions of like-named files. This
includes detection of effective changes vs. actual changes.
-
Once actions are selected via the SSAnalyzer GUI, the changes are implemented
quickly, accurately and with confidence that the results will have no immediate
impact on any project.
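As an illustration of the same-name check described above, here is a minimal Python sketch. It is not SSAnalyzer's actual implementation or schema: the record layout `(file_id, file_name, project, is_deleted)`, the function name `same_named_candidates` and the case-insensitive name comparison are all assumptions.

```python
from collections import defaultdict


def same_named_candidates(links):
    """Find file names that are carried by more than one unique file.

    links: iterable of (file_id, file_name, project, is_deleted) tuples, a
    hypothetical flattening of the imported transaction data.
    Returns {file_name: {file_id: {"projects": [...], "active": n, "deleted": n}}}.
    """
    by_name = defaultdict(dict)
    for file_id, file_name, project, is_deleted in links:
        stats = by_name[file_name.lower()].setdefault(
            file_id, {"projects": [], "active": 0, "deleted": 0}
        )
        stats["projects"].append(project)
        stats["deleted" if is_deleted else "active"] += 1
    # A name is suspect only when two or more unique files carry it.
    return {name: files for name, files in by_name.items() if len(files) > 1}
```

A file shared into many projects appears once, with a long project list. A duplicated or branched file shows up as a second entry under the same name, which is what makes it a candidate for review.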
What are the impacts of SSAnalyzer?
If you agree with the following concepts, then you will completely trust
SSAnalyzer to modify your SourceSafe repository:
-
If the latest version of the bad file exactly matches the latest version
of the primary file, then we can safely and confidently, in each project:
-
Delete the bad file
-
Share in the good file
-
Optionally, purge the bad file (not recommended)
-
If the latest version of the bad file exactly matches a version of the primary
file that is not the latest, then we can safely and confidently, in each
project:
-
Delete the bad file
-
Share in the good file
-
Pin the file at the version at which the match occurred
-
Optionally, purge the bad file (not recommended)
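The two rules above boil down to one small decision. The Python sketch below shows that decision under stated assumptions: versions are represented as plain strings, and the function name `recommend_action` and its return values are hypothetical labels, not SSAnalyzer's own.

```python
def recommend_action(primary_versions, bad_latest):
    """Pick an action for a suspected bad file.

    primary_versions: contents of the primary file, oldest first (index 0 is
    version 1). bad_latest: contents of the bad file's latest version.
    """
    latest = len(primary_versions)
    # Walk from the newest version back so the most recent match wins.
    for version in range(latest, 0, -1):
        if primary_versions[version - 1] == bad_latest:
            if version == latest:
                return ("replace", None)       # delete, then share
            return ("replace_pin", version)    # delete, share, then pin
    # No exact match: leave the decision to a person.
    return ("manual_review", None)
```

When no version matches exactly, the sketch recommends nothing automatic, since that case needs a person to review the differences.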
How else does SSAnalyzer help?
First, SSAnalyzer causes no loss of SourceSafe data except in
one very specific case.
-
In the case where a given project already contains a deleted file with the same
name as the one being replaced, SSAnalyzer will:
-
Purge that deleted file (because SourceSafe does not support more than one
deleted file with the same name)
-
Delete the file to be replaced. This file could be recovered, if
necessary.
Second, every change that SSAnalyzer makes to your
SourceSafe repository is logged in a TO-DO list. This list can be approved at
the:
-
file level - the user independently determined that the change was valid.
-
project level - typically in this instance the project has been rebuilt, QA'd
and determined to work properly.
How quickly can SSAnalyzer help?
First, you need to ensure you have access and rights to the IT features that
SSAnalyzer requires. These are:
-
Access to SourceSafe. Is it installed on your PC and do you have a user ID and
password?
-
Access to MS-SQL and privileges to create a new database (See your database
administrator. Most development shops have one in-house.)
-
Once the database is created, you just need normal select/update/delete rights.
Once you have satisfied these requirements, here are some time estimates based
on the following configuration:
-
a P4-300MHz, 1 GB RAM.
-
SourceSafe repository with about 75,000 files and 3,500 projects
Time estimates:
-
Install SSAnalyzer - less than 5 minutes
-
Create empty SSAnalyzer database - less than 1 minute
-
Import SourceSafe transaction data - less than 10 minutes.
-
The following are in our recommended sequence of actions:
-
Purge deleted projects
-
Typical rate of 1.3 projects per second (about 1.2 minutes per 100).
-
Typically a small number of these (less than 100)
-
Overview of Purge Deleted Projects
-
Purge deleted files
-
Typical rate is 3.0 - 3.5 files per second (about 5 minutes per 1000).
-
May be hundreds or thousands of these.
-
Overview of Purge Deleted Files
-
Identify and correct files with duplicate names with exact matching versions.
-
Scanning rate depends on number of versions per file.
-
Typical scanning rate is about 15 files per second (about 1.1 minutes per
thousand).
-
Applying recommended actions runs at 10 per second.
-
Overview of Cleaning Up Duplicate File Names
-
Identify and correct files with duplicate names using a best match (manual
review required).
-
This scan looks for closely matching files so you can more quickly review
likely candidates to be replaced, merged, etc.
-
Most of your time will be spent reviewing the results of this scan.
-
Optionally rename files as deemed necessary during steps 3 and 4 above.
-
If, while performing steps 3 or 4 above, you determine that a particular file
should be renamed, because its name matches a standard file name or for any
other reason, you may quickly rename it, thereby removing it from the
duplicate file name actions. (Unless of course you give it a name that is
already in use, but then SSAnalyzer will mention that to you.)
-
In short, it is most likely that in less than 30 minutes
you will have:
-
corrected many of your duplicate/branch file name problems.
-
a good feel for how your naming conventions/standards are being enforced
-
a good idea of how extensive any remaining problems may be.
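To apply the rates quoted above to your own repository, the conversion from a processing rate to elapsed time can be sketched as follows. The function name `minutes_for` is an assumption, and the figures will only be as good as the rates, which were measured on the sample configuration.

```python
def minutes_for(item_count, items_per_second):
    """Convert a processing rate into elapsed minutes."""
    return item_count / items_per_second / 60


purge_projects = minutes_for(100, 1.3)     # 100 deleted projects
purge_files_slow = minutes_for(1000, 3.0)  # 1,000 deleted files, low rate
purge_files_fast = minutes_for(1000, 3.5)  # 1,000 deleted files, high rate
scan_duplicates = minutes_for(1000, 15)    # 1,000 files scanned
```

Swap in your own counts of deleted projects, deleted files and same-named files to get a rough idea of how long each step might take.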
How to cost justify SSAnalyzer?
This part is easy! For most organizations, the hourly cost per software
developer is a bit over $50 per hour. If that seems high, think about
salary, benefits, office space, etc. because that is what is included in the
average. SSAnalyzer has a list price of $200. So that means you
have to save just four hours in order to justify the cost of SSAnalyzer.
Here are some questions to help you:
-
How much time is spent fixing the average bug?
-
How many unintentionally different versions of the same file(s) are in your
repository?
-
How many hours have you spent explaining to management (team lead, department
head, VP) about missed deadlines due to unexpected problems?
-
Sure you have team leads and managers for individual projects and they have
tools to help them, but who oversees your entire (or at least your common) base
of code? What tools do they have to help them?
-
Wouldn't you agree that it is better management to know
the extent of the problems in your repository than not to know?
-
Doesn't it sound as though a tool like SSAnalyzer would be of
great benefit during your next maintenance cycle? (You do do those, don't you?)
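The break-even figure quoted above is a one-line calculation. The sketch below simply restates it in Python using the $200 list price and the $50 hourly cost from the text; the function name `break_even_hours` is an assumption.

```python
def break_even_hours(list_price, hourly_cost):
    """Hours of developer time that must be saved to cover the purchase."""
    return list_price / hourly_cost


hours_needed = break_even_hours(200, 50)
```

With the figures from the text this gives four hours. Substitute your own hourly cost to see how the break-even point moves.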
File Comparator
SSAnalyzer has a built-in file comparator with some advanced features
that help in several ways:
-
Determine the true differences between file versions by:
-
Providing two user-selectable views of each comparison:
-
All changes - like in SourceSafe.
-
Effective changes - highlights only changes that have an effect on the code.
Some examples are:
-
Added/Deleted lines that contain only comments are not highlighted.
-
We provide basic lexing for most popular languages: C++, Java, C#, C, VB,
module def, and plain text
-
This covers 5 of the top 10 programming languages according to
the TIOBE Programming Community Index
-
Perl and PHP will be added soon
-
Modifications that are merely a split/merge of two or more lines are not
highlighted.
-
Added/Deleted lines that are empty are not highlighted.
-
Where SourceSafe may simply identify two groups of lines as modified,
SSAnalyzer attempts to match the proper lines that were modified and identify
the remaining lines as adds/deletes.
-
SSAnalyzer tries not to match on empty or punctuation-only lines. This usually
results in truer difference reporting.
-
The user can toggle between the two modes and immediately note the differences,
if any.
-
Provides a GUI that allows quick comparison of all versions between two files
(or the same file)
-
Files are presented in a side-by-side view, similar to SourceSafe
-
Controls are provided independently for each file to go to the first, last,
next or previous version with a single mouse click. The user may also go to a
specific file version.
-
File version dates are provided for quick reference and comparison.
-
Provides quick access to identified changes via first, last, next and previous
error buttons.
-
Provides change statistics in status bar. This includes:
-
Total number of lines in each file version being viewed
-
Number of matched lines
-
Number of added lines
-
Number of modified lines in each file
-
Number of change groups in the comparison
-
Identification of moved lines has not yet been implemented
-
Double-clicking a single line opens a pop-up showing the source line above the
destination line, allowing for easier visual comparison to determine what
modifications have been made. We hope to extend this to multiple lines in the
future.
-
No text search feature yet, but one is planned.
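To give a feel for the idea behind the "effective changes" view described above, here is a minimal Python sketch built on the standard `difflib` module. It is not SSAnalyzer's comparator: it only drops blank lines and lines that start with a single-line comment prefix, and it also ignores indentation, which is an assumption that suits C-style languages but not every language. The function names are hypothetical, and split/merged lines and block comments are not handled.

```python
import difflib


def effective_lines(text, comment_prefix="//"):
    """Drop blank lines and lines that contain only a comment."""
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith(comment_prefix):
            kept.append(stripped)
    return kept


def effective_diff(old_text, new_text, comment_prefix="//"):
    """Return unified-diff lines for changes that survive the filtering."""
    return list(
        difflib.unified_diff(
            effective_lines(old_text, comment_prefix),
            effective_lines(new_text, comment_prefix),
            lineterm="",
        )
    )
```

An empty result means the two versions differ only in comments, blank lines or indentation. For a VB file you might pass `comment_prefix="'"` instead.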
Security concerns?
Rest assured, SSAnalyzer does not store any passwords
you enter. These passwords are used only to connect to either the SQL
database or your SourceSafe repository. When connecting to the SQL database,
SSAnalyzer checks the target database to determine if it is an SSAnalyzer
database. If it is not, then the connection will be aborted. Access
to SourceSafe requires only a typical SourceSafe user ID and password.
You can use an existing ID, but you may want to consider a new ID that is used
only with SSAnalyzer so you can at least track any changes
made using SSAnalyzer.