Mirror of https://github.com/NohamR/Stage-2024.git (synced 2025-05-24 00:49:06 +00:00)

Commit: 3cf13b815c ("clean")
Parent: 5c9313c4ca
13  .gitignore  vendored

@@ -127,7 +127,16 @@ dmypy.json

# node
node_modules/
*.tar

/test
# /yolov7-setup
/yolov7-tracker-example
*.tar

# /yolov7-tracker-example
/yolov7-tracker-example/cfg/training/yolov7x_dataset1_2024_06_19.yaml
/yolov7-tracker-example/data/dataset1_2024_06_19
/yolov7-tracker-example/runs
/yolov7-tracker-example/tracker/config_files/dataset1_2024_06_19.yaml
/yolov7-tracker-example/wandb
/yolov7-tracker-example/info_SF.txt
/yolov7-tracker-example/400m.mp4
674  yolov7-tracker-example/LICENSE.md  Normal file

@@ -0,0 +1,674 @@

GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007

Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.

Preamble

The GNU General Public License is a free, copyleft license for
software and other kinds of works.

The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
the GNU General Public License is intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users. We, the Free Software Foundation, use the
GNU General Public License for most of our software; it applies also to
any other work released this way by its authors. You can apply it to
your programs, too.

When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.

To protect your rights, we need to prevent others from denying you
these rights or asking you to surrender the rights. Therefore, you have
certain responsibilities if you distribute copies of the software, or if
you modify it: responsibilities to respect the freedom of others.

For example, if you distribute copies of such a program, whether
gratis or for a fee, you must pass on to the recipients the same
freedoms that you received. You must make sure that they, too, receive
or can get the source code. And you must show them these terms so they
know their rights.

Developers that use the GNU GPL protect your rights with two steps:
(1) assert copyright on the software, and (2) offer you this License
giving you legal permission to copy, distribute and/or modify it.

For the developers' and authors' protection, the GPL clearly explains
that there is no warranty for this free software. For both users' and
authors' sake, the GPL requires that modified versions be marked as
changed, so that their problems will not be attributed erroneously to
authors of previous versions.

Some devices are designed to deny users access to install or run
modified versions of the software inside them, although the manufacturer
can do so. This is fundamentally incompatible with the aim of
protecting users' freedom to change the software. The systematic
pattern of such abuse occurs in the area of products for individuals to
use, which is precisely where it is most unacceptable. Therefore, we
have designed this version of the GPL to prohibit the practice for those
products. If such problems arise substantially in other domains, we
stand ready to extend this provision to those domains in future versions
of the GPL, as needed to protect the freedom of users.

Finally, every program is threatened constantly by software patents.
States should not allow patents to restrict development and use of
software on general-purpose computers, but in those that do, we wish to
avoid the special danger that patents applied to a free program could
make it effectively proprietary. To prevent this, the GPL assures that
patents cannot be used to render the program non-free.

The precise terms and conditions for copying, distribution and
modification follow.

TERMS AND CONDITIONS

0. Definitions.

"This License" refers to version 3 of the GNU General Public License.

"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.

"The Program" refers to any copyrightable work licensed under this
License. Each licensee is addressed as "you". "Licensees" and
"recipients" may be individuals or organizations.

To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy. The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.

A "covered work" means either the unmodified Program or a work based
on the Program.

To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy. Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.

To "convey" a work means any kind of propagation that enables other
parties to make or receive copies. Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.

An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License. If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.

1. Source Code.

The "source code" for a work means the preferred form of the work
for making modifications to it. "Object code" means any non-source
form of a work.

A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.

The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form. A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.

The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities. However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work. For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.

The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.

The Corresponding Source for a work in source code form is that
same work.

2. Basic Permissions.

All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met. This License explicitly affirms your unlimited
permission to run the unmodified Program. The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work. This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.

You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force. You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright. Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.

Conveying under any other circumstances is permitted solely under
the conditions stated below. Sublicensing is not allowed; section 10
makes it unnecessary.

3. Protecting Users' Legal Rights From Anti-Circumvention Law.

No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.

When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.

4. Conveying Verbatim Copies.

You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.

You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.

5. Conveying Modified Source Versions.

You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:

    a) The work must carry prominent notices stating that you modified
    it, and giving a relevant date.

    b) The work must carry prominent notices stating that it is
    released under this License and any conditions added under section
    7. This requirement modifies the requirement in section 4 to
    "keep intact all notices".

    c) You must license the entire work, as a whole, under this
    License to anyone who comes into possession of a copy. This
    License will therefore apply, along with any applicable section 7
    additional terms, to the whole of the work, and all its parts,
    regardless of how they are packaged. This License gives no
    permission to license the work in any other way, but it does not
    invalidate such permission if you have separately received it.

    d) If the work has interactive user interfaces, each must display
    Appropriate Legal Notices; however, if the Program has interactive
    interfaces that do not display Appropriate Legal Notices, your
    work need not make them do so.

A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit. Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.

6. Conveying Non-Source Forms.

You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:

    a) Convey the object code in, or embodied in, a physical product
    (including a physical distribution medium), accompanied by the
    Corresponding Source fixed on a durable physical medium
    customarily used for software interchange.

    b) Convey the object code in, or embodied in, a physical product
    (including a physical distribution medium), accompanied by a
    written offer, valid for at least three years and valid for as
    long as you offer spare parts or customer support for that product
    model, to give anyone who possesses the object code either (1) a
    copy of the Corresponding Source for all the software in the
    product that is covered by this License, on a durable physical
    medium customarily used for software interchange, for a price no
    more than your reasonable cost of physically performing this
    conveying of source, or (2) access to copy the
    Corresponding Source from a network server at no charge.

    c) Convey individual copies of the object code with a copy of the
    written offer to provide the Corresponding Source. This
    alternative is allowed only occasionally and noncommercially, and
    only if you received the object code with such an offer, in accord
    with subsection 6b.

    d) Convey the object code by offering access from a designated
    place (gratis or for a charge), and offer equivalent access to the
    Corresponding Source in the same way through the same place at no
    further charge. You need not require recipients to copy the
    Corresponding Source along with the object code. If the place to
    copy the object code is a network server, the Corresponding Source
    may be on a different server (operated by you or a third party)
    that supports equivalent copying facilities, provided you maintain
    clear directions next to the object code saying where to find the
    Corresponding Source. Regardless of what server hosts the
    Corresponding Source, you remain obligated to ensure that it is
    available for as long as needed to satisfy these requirements.

    e) Convey the object code using peer-to-peer transmission, provided
    you inform other peers where the object code and Corresponding
    Source of the work are being offered to the general public at no
    charge under subsection 6d.

A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.

A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling. In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage. For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product. A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.

"Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source. The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.

If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information. But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).

The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed. Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.

Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.

7. Additional Terms.

"Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law. If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.

When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it. (Additional permissions may be written to require their own
removal in certain cases when you modify the work.) You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.

Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:

    a) Disclaiming warranty or limiting liability differently from the
    terms of sections 15 and 16 of this License; or

    b) Requiring preservation of specified reasonable legal notices or
    author attributions in that material or in the Appropriate Legal
    Notices displayed by works containing it; or

    c) Prohibiting misrepresentation of the origin of that material, or
    requiring that modified versions of such material be marked in
    reasonable ways as different from the original version; or

    d) Limiting the use for publicity purposes of names of licensors or
    authors of the material; or

    e) Declining to grant rights under trademark law for use of some
    trade names, trademarks, or service marks; or

    f) Requiring indemnification of licensors and authors of that
    material by anyone who conveys the material (or modified versions of
    it) with contractual assumptions of liability to the recipient, for
    any liability that these contractual assumptions directly impose on
    those licensors and authors.

All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10. If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term. If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.

If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.

Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.

8. Termination.

You may not propagate or modify a covered work except as expressly
provided under this License. Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).

However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.

Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.

Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License. If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.

9. Acceptance Not Required for Having Copies.

You are not required to accept this License in order to receive or
run a copy of the Program. Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance. However,
nothing other than this License grants you permission to propagate or
modify any covered work. These actions infringe copyright if you do
not accept this License. Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.

10. Automatic Licensing of Downstream Recipients.

Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License. You are not responsible
for enforcing compliance by third parties with this License.

An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations. If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.

You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License. For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.

11. Patents.

A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based. The
work thus licensed is called the contributor's "contributor version".

A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version. For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.

Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.

In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement). To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.

If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients. "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.

If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.

A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License. You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.

Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.

12. No Surrender of Others' Freedom.

If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all. For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.

13. Use with the GNU Affero General Public License.

Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU Affero General Public License into a single
combined work, and to convey the resulting work. The terms of this
License will continue to apply to the part which is the covered work,
but the special requirements of the GNU Affero General Public License,
section 13, concerning interaction through a network will apply to the
combination as such.

14. Revised Versions of this License.

The Free Software Foundation may publish revised and/or new versions of
the GNU General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.

Each version is given a distinguishing version number. If the
Program specifies that a certain numbered version of the GNU General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of the
GNU General Public License, you may choose any version ever published
by the Free Software Foundation.

If the Program specifies that a proxy can decide which future
versions of the GNU General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.

Later license versions may give you additional or different
permissions. However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.

15. Disclaimer of Warranty.

THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

16. Limitation of Liability.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.

17. Interpretation of Sections 15 and 16.

If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.

END OF TERMS AND CONDITIONS

How to Apply These Terms to Your New Programs

If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.

To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.

    <one line to give the program's name and a brief idea of what it does.>
    Copyright (C) <year> <name of author>

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program. If not, see <https://www.gnu.org/licenses/>.

Also add information on how to contact you by electronic and paper mail.

If the program does terminal interaction, make it output a short
notice like this when it starts in an interactive mode:

    <program> Copyright (C) <year> <name of author>
    This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
    This is free software, and you are welcome to redistribute it
    under certain conditions; type `show c' for details.

The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, your program's commands
might be different; for a GUI interface, you would use an "about box".

You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU GPL, see
<https://www.gnu.org/licenses/>.

The GNU General Public License does not permit incorporating your program
into proprietary programs. If your program is a subroutine library, you
may consider it more useful to permit linking proprietary applications with
the library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License. But first, please read
<https://www.gnu.org/licenses/why-not-lgpl.html>.
194  yolov7-tracker-example/README.md  Normal file

@@ -0,0 +1,194 @@

# YOLO detector and SOTA Multi-object tracker Toolbox

## ❗❗ Important Notes

Compared to the previous version, this is an ***entirely new version (branch v2)***!!!

**Please use this version directly, as I have rewritten almost all the code to ensure better readability and improved results, as well as to correct some errors in the past code.**

```bash
git clone https://github.com/JackWoo0831/Yolov7-tracker.git
git checkout v2  # change to the v2 branch!!
```

🙌 ***If you have any suggestions for adding trackers***, please leave a comment in the Issues section with the paper title or link! Everyone is welcome to contribute to making this repo better.

<div align="center">

**Language**: English | [简体中文](README_CN.md)

</div>

## ❤️ Introduction

This repo is a toolbox that implements **multi-object trackers of the tracking-by-detection paradigm**. The detector supports:

- YOLOX
- YOLO v7
- YOLO v8

and the tracker supports:

- SORT
- DeepSORT
- ByteTrack ([ECCV 2022](https://arxiv.org/pdf/2110.06864))
- BoT-SORT ([arXiv 2206](https://arxiv.org/pdf/2206.14651.pdf))
- OCSORT ([CVPR 2023](https://openaccess.thecvf.com/content/CVPR2023/papers/Cao_Observation-Centric_SORT_Rethinking_SORT_for_Robust_Multi-Object_Tracking_CVPR_2023_paper.pdf))
- C-BIoU Track ([arXiv 2211](https://arxiv.org/pdf/2211.14317v2.pdf))
- StrongSORT ([IEEE TMM 2023](https://arxiv.org/pdf/2202.13514))
- SparseTrack ([arXiv 2306](https://arxiv.org/pdf/2306.05238))

and the ReID model supports:

- OSNet
- the feature extractor from DeepSORT

The highlights are:

- Supports more trackers than MMTracking
- Rewrites multiple trackers in a ***unified code style***, with no need to configure a separate environment for each tracker
- Modular design, which ***decouples*** the detector, tracker, ReID model and Kalman filter, making experiments easy to conduct


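To make the decoupled design concrete, here is a minimal, hypothetical sketch of the tracking-by-detection loop. The names (`detector`, `tracker.update`, `t.tlwh`) are illustrative only, not this repo's actual API; the real entry point is `tracker/track.py`:

```python
# Illustrative only: the names below are hypothetical, not this repo's API.
# The point is the paradigm: a detector proposes boxes per frame, and a
# decoupled tracker associates them into identities across frames.

def run_tracking(frames, detector, tracker):
    """frames: iterable of HxWx3 images; detector/tracker: pluggable modules."""
    results = []
    for frame_id, frame in enumerate(frames):
        # 1. Detection: per-frame boxes, e.g. rows of [x1, y1, x2, y2, score, class].
        dets = detector(frame)
        # 2. Association: the tracker matches detections to existing tracks
        #    (e.g. via IoU plus a Kalman-filter motion prediction, optionally
        #    a ReID embedding) and returns the currently active tracks.
        tracks = tracker.update(dets, frame)
        for t in tracks:
            results.append((frame_id, t.track_id, *t.tlwh, t.score))
    return results
```

Because the detector, tracker, ReID model and Kalman filter are separate modules, any of them can be swapped without touching the rest of the loop.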
## 🗺️ Roadmap

- [x] Add StrongSORT and SparseTrack
- [x] Add a save-video function
- [x] Add a timer function to calculate FPS
- [ ] Add more ReID modules.

## 🔨 Installation

The basic environment is:

- Ubuntu 18.04
- Python: 3.9, PyTorch: 1.12

Run the following command to install the other packages:

```bash
pip3 install -r requirements.txt
```

### 🔍 Detector installation

1. YOLOX:

The version of YOLOX is **0.1.0 (the same as ByteTrack)**. To install it, clone the ByteTrack repo somewhere and run:

```bash
git clone https://github.com/ifzhang/ByteTrack.git
cd ByteTrack
python3 setup.py develop
```

2. YOLO v7:

There is no need to execute additional steps, as this repo itself is based on YOLOv7.

3. YOLO v8:

Please run:

```bash
pip3 install ultralytics==8.0.94
```

### 📑 Data preparation

***If you do not want to test on a specific dataset and only want to run demos, please skip this section.***

***Whatever dataset you want to test on, please organize it in the following way (YOLO style):***

```
dataset_name
   |---images
      |---train
         |---sequence_name1
            |---000001.jpg
            |---000002.jpg ...
      |---val ...
      |---test ...
```

You can refer to the code in `./tools` to see how to organize the datasets; a minimal reading sketch follows below.
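For instance, a minimal sketch (hypothetical helper, not repo code) of enumerating sequences and frames from the layout above with `pathlib`:

```python
# Minimal sketch, not repo code: enumerate sequences/frames from the
# YOLO-style layout shown above.
from pathlib import Path

def list_sequences(dataset_root: str, split: str = "train") -> dict:
    """Map each sequence name to its sorted list of frame paths."""
    split_dir = Path(dataset_root) / "images" / split
    return {
        seq.name: sorted(seq.glob("*.jpg"))
        for seq in sorted(split_dir.iterdir()) if seq.is_dir()
    }

# Example: frames = list_sequences("/data/xxxx/datasets/MOT17", "train")["sequence_name1"]
```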
***Then, you need to prepare a `yaml` file to indicate the path so that the code can find the images.***

Some examples are in `tracker/config_files`. The important keys are:

```
DATASET_ROOT: '/data/xxxx/datasets/MOT17'  # your dataset root
SPLIT: test  # train, test or val
CATEGORY_NAMES:  # same as in YOLO training
  - 'pedestrian'

CATEGORY_DICT:
  0: 'pedestrian'
```
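As an illustration of how those keys fit together, here is a hedged sketch (assuming PyYAML is installed; the config file name is hypothetical) of loading such a config and resolving the image directory:

```python
# Sketch only (assumes PyYAML); the config path below is hypothetical.
from pathlib import Path
import yaml

with open("tracker/config_files/my_dataset.yaml") as f:
    cfg = yaml.safe_load(f)

# DATASET_ROOT + SPLIT locate the images organized in the YOLO-style layout.
images_dir = Path(cfg["DATASET_ROOT"]) / "images" / cfg["SPLIT"]
id_to_name = cfg["CATEGORY_DICT"]  # e.g. {0: 'pedestrian'}
print(f"Reading sequences from {images_dir}; classes: {id_to_name}")
```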
## 🚗 Practice

### 🏃 Training

Trackers generally do not have parameters that require training. Please refer to the training methods of the different detectors to train the YOLOs.

Some references that may help you:

- YOLOX: `tracker/yolox_utils/train_yolox.py`

- YOLO v7:

```shell
python train_aux.py --dataset visdrone --workers 8 --device <$GPU_id$> --batch-size 16 --data data/visdrone_all.yaml --img 1280 1280 --cfg cfg/training/yolov7-w6.yaml --weights <$YOLO v7 pretrained model path$> --name yolov7-w6-custom --hyp data/hyp.scratch.custom.yaml
```

- YOLO v8: `tracker/yolov8_utils/train_yolov8.py`

### 😊 Tracking!

If you only want to run a demo:

```bash
python tracker/track_demo.py --obj ${video path or images folder path} --detector ${yolox, yolov8 or yolov7} --tracker ${tracker name} --kalman_format ${kalman format, sort, byte, ...} --detector_model_path ${detector weight path} --save_images
```

For example:

```bash
python tracker/track_demo.py --obj M0203.mp4 --detector yolov8 --tracker deepsort --kalman_format byte --detector_model_path weights/yolov8l_UAVDT_60epochs_20230509.pt --save_images
```

If you want to run trackers on a dataset:

```bash
python tracker/track.py --dataset ${dataset name, related to the yaml file} --detector ${yolox, yolov8 or yolov7} --tracker ${tracker name} --kalman_format ${kalman format, sort, byte, ...} --detector_model_path ${detector weight path}
```

For example:

- SORT: `python tracker/track.py --dataset uavdt --detector yolov8 --tracker sort --kalman_format sort --detector_model_path weights/yolov8l_UAVDT_60epochs_20230509.pt`

- DeepSORT: `python tracker/track.py --dataset uavdt --detector yolov7 --tracker deepsort --kalman_format byte --detector_model_path weights/yolov7_UAVDT_35epochs_20230507.pt`

- ByteTrack: `python tracker/track.py --dataset uavdt --detector yolov8 --tracker bytetrack --kalman_format byte --detector_model_path weights/yolov8l_UAVDT_60epochs_20230509.pt`

- OCSort: `python tracker/track.py --dataset uavdt --detector yolov8 --tracker ocsort --kalman_format ocsort --detector_model_path weights/yolov8l_UAVDT_60epochs_20230509.pt`

- C-BIoU Track: `python tracker/track.py --dataset uavdt --detector yolov8 --tracker c_bioutrack --kalman_format bot --detector_model_path weights/yolov8l_UAVDT_60epochs_20230509.pt`

- BoT-SORT: `python tracker/track.py --dataset uavdt --detector yolox --tracker botsort --kalman_format bot --detector_model_path weights/yolox_m_uavdt_50epochs.pth.tar`

- StrongSORT: `python tracker/track.py --dataset uavdt --detector yolov8 --tracker strongsort --kalman_format strongsort --detector_model_path weights/yolov8l_UAVDT_60epochs_20230509.pt`

- SparseTrack: `python tracker/track.py --dataset uavdt --detector yolov8 --tracker sparsetrack --kalman_format bot --detector_model_path weights/yolov8l_UAVDT_60epochs_20230509.pt`

### ✅ Evaluation

Coming soon. As an alternative, after obtaining the result txt file, you can use the [Easier to use TrackEval repo](https://github.com/JackWoo0831/Easier_To_Use_TrackEval).
186  yolov7-tracker-example/README_CN.md  Normal file

@@ -0,0 +1,186 @@

# YOLO Detector and SOTA Multi-Object Tracker Toolbox

## ❗❗ Important Notes

Compared to the previous version, this is an ***entirely new version (branch v2)***!!!

**Please use this version directly, as I have rewritten almost all the code to ensure better readability and improved results, and corrected some errors in the previous code.**

```bash
git clone https://github.com/JackWoo0831/Yolov7-tracker.git
git checkout v2  # change to the v2 branch!!
```

🙌 ***If you have any suggestions for adding trackers***, please leave a message in the Issues section with the paper title or link! Everyone is welcome to help make this repo better.

## ❤️ Introduction

This repo is a toolbox implementing multi-object trackers of the ***tracking-by-detection paradigm***. The detector supports:

- YOLOX
- YOLO v7
- YOLO v8

The tracker supports:

- SORT
- DeepSORT
- ByteTrack ([ECCV 2022](https://arxiv.org/pdf/2110.06864))
- BoT-SORT ([arXiv 2206](https://arxiv.org/pdf/2206.14651.pdf))
- OCSORT ([CVPR 2023](https://openaccess.thecvf.com/content/CVPR2023/papers/Cao_Observation-Centric_SORT_Rethinking_SORT_for_Robust_Multi-Object_Tracking_CVPR_2023_paper.pdf))
- C-BIoU Track ([arXiv 2211](https://arxiv.org/pdf/2211.14317v2.pdf))
- StrongSORT ([IEEE TMM 2023](https://arxiv.org/pdf/2202.13514))
- SparseTrack ([arXiv 2306](https://arxiv.org/pdf/2306.05238))

The supported ReID models are:

- OSNet
- the extractor from DeepSORT

Highlights include:

- Supports more trackers than MMTracking
- Rewrites multiple trackers in a ***unified code style***, with no need to configure a separate environment for each tracker
- Modular design that **decouples** the detector, tracker, appearance-embedding module and Kalman filter, making experiments easy to conduct



## 🗺️ Roadmap

- [x] Add StrongSORT and SparseTrack
- [x] Add a save-video function
- [x] Add a timer function to calculate FPS
- [ ] Add more ReID modules.

## 🔨 Installation

The basic environment is:

- Ubuntu 18.04
- Python: 3.9, PyTorch: 1.12

Run the following command to install the other packages:

```bash
pip3 install -r requirements.txt
```

### 🔍 Detector installation

1. YOLOX:

The YOLOX version is 0.1.0 (the same as ByteTrack). To install it, clone the ByteTrack repo somewhere and run:

```bash
git clone https://github.com/ifzhang/ByteTrack.git
cd ByteTrack
python3 setup.py develop
```

2. YOLO v7:

Since this repo itself is based on YOLOv7, no additional steps are needed.

3. YOLO v8:

Please run:

```bash
pip3 install ultralytics==8.0.94
```

### 📑 Data preparation

***If you do not want to test on a specific dataset and only want to run demos, please skip this section.***

***Whatever dataset you want to test on, please organize it as follows (YOLO style):***

```
dataset_name
   |---images
      |---train
         |---sequence_name1
            |---000001.jpg
            |---000002.jpg ...
      |---val ...
      |---test ...
```

You can refer to the code in `./tools` to see how to organize the datasets.

***Then, you need to prepare a yaml file to indicate the paths so that the code can find the images.***

Some examples are in `tracker/config_files`. The important keys are:

```
DATASET_ROOT: '/data/xxxx/datasets/MOT17'  # your dataset root
SPLIT: test  # train, test or val
CATEGORY_NAMES:  # same as in YOLO training
  - 'pedestrian'

CATEGORY_DICT:
  0: 'pedestrian'
```

## 🚗 Practice

### 🏃 Training

Trackers generally do not need parameters to be trained. Please refer to the training methods of the different detectors to train the YOLOs.

The following references may help you:

- YOLOX: `tracker/yolox_utils/train_yolox.py`

- YOLO v7:

```shell
python train_aux.py --dataset visdrone --workers 8 --device <$GPU_id$> --batch-size 16 --data data/visdrone_all.yaml --img 1280 1280 --cfg cfg/training/yolov7-w6.yaml --weights <$YOLO v7 pretrained model path$> --name yolov7-w6-custom --hyp data/hyp.scratch.custom.yaml
```

- YOLO v8: `tracker/yolov8_utils/train_yolov8.py`

### 😊 Tracking!

If you just want to run a demo:

```bash
python tracker/track_demo.py --obj ${video path or images folder path} --detector ${yolox, yolov8 or yolov7} --tracker ${tracker name} --kalman_format ${kalman format, sort, byte, ...} --detector_model_path ${detector weight path} --save_images
```

For example:

```bash
python tracker/track_demo.py --obj M0203.mp4 --detector yolov8 --tracker deepsort --kalman_format byte --detector_model_path weights/yolov8l_UAVDT_60epochs_20230509.pt --save_images
```

If you want to test on a dataset:

```bash
python tracker/track.py --dataset ${dataset name, related to the yaml file} --detector ${yolox, yolov8 or yolov7} --tracker ${tracker name} --kalman_format ${kalman format, sort, byte, ...} --detector_model_path ${detector weight path}
```

For example:

- SORT: `python tracker/track.py --dataset uavdt --detector yolov8 --tracker sort --kalman_format sort --detector_model_path weights/yolov8l_UAVDT_60epochs_20230509.pt`

- DeepSORT: `python tracker/track.py --dataset uavdt --detector yolov7 --tracker deepsort --kalman_format byte --detector_model_path weights/yolov7_UAVDT_35epochs_20230507.pt`

- ByteTrack: `python tracker/track.py --dataset uavdt --detector yolov8 --tracker bytetrack --kalman_format byte --detector_model_path weights/yolov8l_UAVDT_60epochs_20230509.pt`

- OCSort: `python tracker/track.py --dataset uavdt --detector yolov8 --tracker ocsort --kalman_format ocsort --detector_model_path weights/yolov8l_UAVDT_60epochs_20230509.pt`

- C-BIoU Track: `python tracker/track.py --dataset uavdt --detector yolov8 --tracker c_bioutrack --kalman_format bot --detector_model_path weights/yolov8l_UAVDT_60epochs_20230509.pt`

- BoT-SORT: `python tracker/track.py --dataset uavdt --detector yolox --tracker botsort --kalman_format bot --detector_model_path weights/yolox_m_uavdt_50epochs.pth.tar`

### ✅ Evaluation

Coming soon! As an alternative, you can use this repo: [Easier to use TrackEval repo](https://github.com/JackWoo0831/Easier_To_Use_TrackEval).
49  yolov7-tracker-example/cfg/baseline/r50-csp.yaml  Normal file

@@ -0,0 +1,49 @@

# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [12,16, 19,36, 40,28]  # P3/8
  - [36,75, 76,55, 72,146]  # P4/16
  - [142,110, 192,243, 459,401]  # P5/32

# CSP-ResNet backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Stem, [128]],  # 0-P1/2
   [-1, 3, ResCSPC, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 2-P3/8
   [-1, 4, ResCSPC, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 4-P3/8
   [-1, 6, ResCSPC, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 6-P3/8
   [-1, 3, ResCSPC, [1024]],  # 7
  ]

# CSP-Res-PAN head
head:
  [[-1, 1, SPPCSPC, [512]],  # 8
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [5, 1, Conv, [256, 1, 1]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],
   [-1, 2, ResCSPB, [256]],  # 13
   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [3, 1, Conv, [128, 1, 1]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],
   [-1, 2, ResCSPB, [128]],  # 18
   [-1, 1, Conv, [256, 3, 1]],
   [-2, 1, Conv, [256, 3, 2]],
   [[-1, 13], 1, Concat, [1]],  # cat
   [-1, 2, ResCSPB, [256]],  # 22
   [-1, 1, Conv, [512, 3, 1]],
   [-2, 1, Conv, [512, 3, 2]],
   [[-1, 8], 1, Concat, [1]],  # cat
   [-1, 2, ResCSPB, [512]],  # 26
   [-1, 1, Conv, [1024, 3, 1]],

   [[19,23,27], 1, IDetect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
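These cfg files follow the YOLOv5/YOLOv7 model-description convention, where each row is `[from, number, module, args]` and is consumed by a `parse_model`-style builder (in YOLOv7, `models/yolo.py`). A hedged sketch of how one row is conventionally interpreted; the helper below is illustrative, not the repo's actual builder:

```python
# Hedged sketch of the [from, number, module, args] convention used above
# (illustrative; the repo's real logic lives in its parse_model builder):
#   from   : input layer index (-1 = previous layer, -2 = two layers back,
#            a list such as [-1, -2] means concatenate those layers' outputs)
#   number : module repeat count, scaled by depth_multiple
#   module : the layer class to instantiate (Conv, ResCSPC, Concat, ...)
#   args   : constructor args; output channels scale with width_multiple
import math

def scale_row(row, depth_multiple=1.0, width_multiple=1.0):
    frm, number, module, args = row
    repeats = max(round(number * depth_multiple), 1) if number > 1 else number
    if module in ("Conv", "ResCSPC", "ResCSPB", "SPPCSPC"):  # channel-carrying
        # round scaled channels up to a multiple of 8, as make_divisible does
        args = [math.ceil(args[0] * width_multiple / 8) * 8, *args[1:]]
    return frm, repeats, module, args

# e.g. scale_row([-1, 4, "ResCSPC", [256]], depth_multiple=1.33, width_multiple=1.25)
# -> (-1, 5, "ResCSPC", [320]) under the larger multiples used by the -x variants
```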
49  yolov7-tracker-example/cfg/baseline/x50-csp.yaml  Normal file

@@ -0,0 +1,49 @@

# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [12,16, 19,36, 40,28]  # P3/8
  - [36,75, 76,55, 72,146]  # P4/16
  - [142,110, 192,243, 459,401]  # P5/32

# CSP-ResNeXt backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Stem, [128]],  # 0-P1/2
   [-1, 3, ResXCSPC, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 2-P3/8
   [-1, 4, ResXCSPC, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 4-P3/8
   [-1, 6, ResXCSPC, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 6-P3/8
   [-1, 3, ResXCSPC, [1024]],  # 7
  ]

# CSP-ResX-PAN head
head:
  [[-1, 1, SPPCSPC, [512]],  # 8
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [5, 1, Conv, [256, 1, 1]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],
   [-1, 2, ResXCSPB, [256]],  # 13
   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [3, 1, Conv, [128, 1, 1]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],
   [-1, 2, ResXCSPB, [128]],  # 18
   [-1, 1, Conv, [256, 3, 1]],
   [-2, 1, Conv, [256, 3, 2]],
   [[-1, 13], 1, Concat, [1]],  # cat
   [-1, 2, ResXCSPB, [256]],  # 22
   [-1, 1, Conv, [512, 3, 1]],
   [-2, 1, Conv, [512, 3, 2]],
   [[-1, 8], 1, Concat, [1]],  # cat
   [-1, 2, ResXCSPB, [512]],  # 26
   [-1, 1, Conv, [1024, 3, 1]],

   [[19,23,27], 1, IDetect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
52
yolov7-tracker-example/cfg/baseline/yolor-csp-x.yaml
Normal file
@@ -0,0 +1,52 @@
# parameters
nc: 80  # number of classes
depth_multiple: 1.33  # model depth multiple
width_multiple: 1.25  # layer channel multiple

# anchors
anchors:
  - [12,16, 19,36, 40,28]  # P3/8
  - [36,75, 76,55, 72,146]  # P4/16
  - [142,110, 192,243, 459,401]  # P5/32

# CSP-Darknet backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [32, 3, 1]],  # 0
   [-1, 1, Conv, [64, 3, 2]],  # 1-P1/2
   [-1, 1, Bottleneck, [64]],
   [-1, 1, Conv, [128, 3, 2]],  # 3-P2/4
   [-1, 2, BottleneckCSPC, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 5-P3/8
   [-1, 8, BottleneckCSPC, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 7-P4/16
   [-1, 8, BottleneckCSPC, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 9-P5/32
   [-1, 4, BottleneckCSPC, [1024]],  # 10
  ]

# CSP-Dark-PAN head
head:
  [[-1, 1, SPPCSPC, [512]],  # 11
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [8, 1, Conv, [256, 1, 1]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],
   [-1, 2, BottleneckCSPB, [256]],  # 16
   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [6, 1, Conv, [128, 1, 1]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],
   [-1, 2, BottleneckCSPB, [128]],  # 21
   [-1, 1, Conv, [256, 3, 1]],
   [-2, 1, Conv, [256, 3, 2]],
   [[-1, 16], 1, Concat, [1]],  # cat
   [-1, 2, BottleneckCSPB, [256]],  # 25
   [-1, 1, Conv, [512, 3, 1]],
   [-2, 1, Conv, [512, 3, 2]],
   [[-1, 11], 1, Concat, [1]],  # cat
   [-1, 2, BottleneckCSPB, [512]],  # 29
   [-1, 1, Conv, [1024, 3, 1]],

   [[22,26,30], 1, IDetect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
52
yolov7-tracker-example/cfg/baseline/yolor-csp.yaml
Normal file
@@ -0,0 +1,52 @@
# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [12,16, 19,36, 40,28]  # P3/8
  - [36,75, 76,55, 72,146]  # P4/16
  - [142,110, 192,243, 459,401]  # P5/32

# CSP-Darknet backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [32, 3, 1]],  # 0
   [-1, 1, Conv, [64, 3, 2]],  # 1-P1/2
   [-1, 1, Bottleneck, [64]],
   [-1, 1, Conv, [128, 3, 2]],  # 3-P2/4
   [-1, 2, BottleneckCSPC, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 5-P3/8
   [-1, 8, BottleneckCSPC, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 7-P4/16
   [-1, 8, BottleneckCSPC, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 9-P5/32
   [-1, 4, BottleneckCSPC, [1024]],  # 10
  ]

# CSP-Dark-PAN head
head:
  [[-1, 1, SPPCSPC, [512]],  # 11
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [8, 1, Conv, [256, 1, 1]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],
   [-1, 2, BottleneckCSPB, [256]],  # 16
   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [6, 1, Conv, [128, 1, 1]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],
   [-1, 2, BottleneckCSPB, [128]],  # 21
   [-1, 1, Conv, [256, 3, 1]],
   [-2, 1, Conv, [256, 3, 2]],
   [[-1, 16], 1, Concat, [1]],  # cat
   [-1, 2, BottleneckCSPB, [256]],  # 25
   [-1, 1, Conv, [512, 3, 1]],
   [-2, 1, Conv, [512, 3, 2]],
   [[-1, 11], 1, Concat, [1]],  # cat
   [-1, 2, BottleneckCSPB, [512]],  # 29
   [-1, 1, Conv, [1024, 3, 1]],

   [[22,26,30], 1, IDetect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
63
yolov7-tracker-example/cfg/baseline/yolor-d6.yaml
Normal file
@@ -0,0 +1,63 @@
# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # expand model depth
width_multiple: 1.25  # expand layer channels

# anchors
anchors:
  - [ 19,27, 44,40, 38,94 ]  # P3/8
  - [ 96,68, 86,152, 180,137 ]  # P4/16
  - [ 140,301, 303,264, 238,542 ]  # P5/32
  - [ 436,615, 739,380, 925,792 ]  # P6/64

# CSP-Darknet backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, ReOrg, []],  # 0
   [-1, 1, Conv, [64, 3, 1]],  # 1-P1/2
   [-1, 1, DownC, [128]],  # 2-P2/4
   [-1, 3, BottleneckCSPA, [128]],
   [-1, 1, DownC, [256]],  # 4-P3/8
   [-1, 15, BottleneckCSPA, [256]],
   [-1, 1, DownC, [512]],  # 6-P4/16
   [-1, 15, BottleneckCSPA, [512]],
   [-1, 1, DownC, [768]],  # 8-P5/32
   [-1, 7, BottleneckCSPA, [768]],
   [-1, 1, DownC, [1024]],  # 10-P6/64
   [-1, 7, BottleneckCSPA, [1024]],  # 11
  ]

# CSP-Dark-PAN head
head:
  [[-1, 1, SPPCSPC, [512]],  # 12
   [-1, 1, Conv, [384, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [-6, 1, Conv, [384, 1, 1]],  # route backbone P5
   [[-1, -2], 1, Concat, [1]],
   [-1, 3, BottleneckCSPB, [384]],  # 17
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [-13, 1, Conv, [256, 1, 1]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],
   [-1, 3, BottleneckCSPB, [256]],  # 22
   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [-20, 1, Conv, [128, 1, 1]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],
   [-1, 3, BottleneckCSPB, [128]],  # 27
   [-1, 1, Conv, [256, 3, 1]],
   [-2, 1, DownC, [256]],
   [[-1, 22], 1, Concat, [1]],  # cat
   [-1, 3, BottleneckCSPB, [256]],  # 31
   [-1, 1, Conv, [512, 3, 1]],
   [-2, 1, DownC, [384]],
   [[-1, 17], 1, Concat, [1]],  # cat
   [-1, 3, BottleneckCSPB, [384]],  # 35
   [-1, 1, Conv, [768, 3, 1]],
   [-2, 1, DownC, [512]],
   [[-1, 12], 1, Concat, [1]],  # cat
   [-1, 3, BottleneckCSPB, [512]],  # 39
   [-1, 1, Conv, [1024, 3, 1]],

   [[28,32,36,40], 1, IDetect, [nc, anchors]],  # Detect(P3, P4, P5, P6)
  ]
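The P6 configs above open with a `ReOrg` layer rather than a strided conv. In YOLOR/YOLOv7 this is a space-to-depth rearrangement: every 2x2 spatial block moves into the channel axis, halving H and W and quadrupling C, which is why layer 1 already sits at P1/2. A sketch matching that behavior:

```python
import torch
import torch.nn as nn

class ReOrg(nn.Module):
    # Space-to-depth: (B, C, H, W) -> (B, 4C, H/2, W/2), no learned weights.
    def forward(self, x):
        return torch.cat([x[..., ::2, ::2],     # even rows, even cols
                          x[..., 1::2, ::2],    # odd rows,  even cols
                          x[..., ::2, 1::2],    # even rows, odd cols
                          x[..., 1::2, 1::2]],  # odd rows,  odd cols
                         1)
```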
63
yolov7-tracker-example/cfg/baseline/yolor-e6.yaml
Normal file
@@ -0,0 +1,63 @@
# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # expand model depth
width_multiple: 1.25  # expand layer channels

# anchors
anchors:
  - [ 19,27, 44,40, 38,94 ]  # P3/8
  - [ 96,68, 86,152, 180,137 ]  # P4/16
  - [ 140,301, 303,264, 238,542 ]  # P5/32
  - [ 436,615, 739,380, 925,792 ]  # P6/64

# CSP-Darknet backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, ReOrg, []],  # 0
   [-1, 1, Conv, [64, 3, 1]],  # 1-P1/2
   [-1, 1, DownC, [128]],  # 2-P2/4
   [-1, 3, BottleneckCSPA, [128]],
   [-1, 1, DownC, [256]],  # 4-P3/8
   [-1, 7, BottleneckCSPA, [256]],
   [-1, 1, DownC, [512]],  # 6-P4/16
   [-1, 7, BottleneckCSPA, [512]],
   [-1, 1, DownC, [768]],  # 8-P5/32
   [-1, 3, BottleneckCSPA, [768]],
   [-1, 1, DownC, [1024]],  # 10-P6/64
   [-1, 3, BottleneckCSPA, [1024]],  # 11
  ]

# CSP-Dark-PAN head
head:
  [[-1, 1, SPPCSPC, [512]],  # 12
   [-1, 1, Conv, [384, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [-6, 1, Conv, [384, 1, 1]],  # route backbone P5
   [[-1, -2], 1, Concat, [1]],
   [-1, 3, BottleneckCSPB, [384]],  # 17
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [-13, 1, Conv, [256, 1, 1]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],
   [-1, 3, BottleneckCSPB, [256]],  # 22
   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [-20, 1, Conv, [128, 1, 1]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],
   [-1, 3, BottleneckCSPB, [128]],  # 27
   [-1, 1, Conv, [256, 3, 1]],
   [-2, 1, DownC, [256]],
   [[-1, 22], 1, Concat, [1]],  # cat
   [-1, 3, BottleneckCSPB, [256]],  # 31
   [-1, 1, Conv, [512, 3, 1]],
   [-2, 1, DownC, [384]],
   [[-1, 17], 1, Concat, [1]],  # cat
   [-1, 3, BottleneckCSPB, [384]],  # 35
   [-1, 1, Conv, [768, 3, 1]],
   [-2, 1, DownC, [512]],
   [[-1, 12], 1, Concat, [1]],  # cat
   [-1, 3, BottleneckCSPB, [512]],  # 39
   [-1, 1, Conv, [1024, 3, 1]],

   [[28,32,36,40], 1, IDetect, [nc, anchors]],  # Detect(P3, P4, P5, P6)
  ]
63
yolov7-tracker-example/cfg/baseline/yolor-p6.yaml
Normal file
@@ -0,0 +1,63 @@
# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # expand model depth
width_multiple: 1.0  # expand layer channels

# anchors
anchors:
  - [ 19,27, 44,40, 38,94 ]  # P3/8
  - [ 96,68, 86,152, 180,137 ]  # P4/16
  - [ 140,301, 303,264, 238,542 ]  # P5/32
  - [ 436,615, 739,380, 925,792 ]  # P6/64

# CSP-Darknet backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, ReOrg, []],  # 0
   [-1, 1, Conv, [64, 3, 1]],  # 1-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 2-P2/4
   [-1, 3, BottleneckCSPA, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 4-P3/8
   [-1, 7, BottleneckCSPA, [256]],
   [-1, 1, Conv, [384, 3, 2]],  # 6-P4/16
   [-1, 7, BottleneckCSPA, [384]],
   [-1, 1, Conv, [512, 3, 2]],  # 8-P5/32
   [-1, 3, BottleneckCSPA, [512]],
   [-1, 1, Conv, [640, 3, 2]],  # 10-P6/64
   [-1, 3, BottleneckCSPA, [640]],  # 11
  ]

# CSP-Dark-PAN head
head:
  [[-1, 1, SPPCSPC, [320]],  # 12
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [-6, 1, Conv, [256, 1, 1]],  # route backbone P5
   [[-1, -2], 1, Concat, [1]],
   [-1, 3, BottleneckCSPB, [256]],  # 17
   [-1, 1, Conv, [192, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [-13, 1, Conv, [192, 1, 1]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],
   [-1, 3, BottleneckCSPB, [192]],  # 22
   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [-20, 1, Conv, [128, 1, 1]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],
   [-1, 3, BottleneckCSPB, [128]],  # 27
   [-1, 1, Conv, [256, 3, 1]],
   [-2, 1, Conv, [192, 3, 2]],
   [[-1, 22], 1, Concat, [1]],  # cat
   [-1, 3, BottleneckCSPB, [192]],  # 31
   [-1, 1, Conv, [384, 3, 1]],
   [-2, 1, Conv, [256, 3, 2]],
   [[-1, 17], 1, Concat, [1]],  # cat
   [-1, 3, BottleneckCSPB, [256]],  # 35
   [-1, 1, Conv, [512, 3, 1]],
   [-2, 1, Conv, [320, 3, 2]],
   [[-1, 12], 1, Concat, [1]],  # cat
   [-1, 3, BottleneckCSPB, [320]],  # 39
   [-1, 1, Conv, [640, 3, 1]],

   [[28,32,36,40], 1, IDetect, [nc, anchors]],  # Detect(P3, P4, P5, P6)
  ]
63
yolov7-tracker-example/cfg/baseline/yolor-w6.yaml
Normal file
@@ -0,0 +1,63 @@
# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # expand model depth
width_multiple: 1.0  # expand layer channels

# anchors
anchors:
  - [ 19,27, 44,40, 38,94 ]  # P3/8
  - [ 96,68, 86,152, 180,137 ]  # P4/16
  - [ 140,301, 303,264, 238,542 ]  # P5/32
  - [ 436,615, 739,380, 925,792 ]  # P6/64

# CSP-Darknet backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, ReOrg, []],  # 0
   [-1, 1, Conv, [64, 3, 1]],  # 1-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 2-P2/4
   [-1, 3, BottleneckCSPA, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 4-P3/8
   [-1, 7, BottleneckCSPA, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 6-P4/16
   [-1, 7, BottleneckCSPA, [512]],
   [-1, 1, Conv, [768, 3, 2]],  # 8-P5/32
   [-1, 3, BottleneckCSPA, [768]],
   [-1, 1, Conv, [1024, 3, 2]],  # 10-P6/64
   [-1, 3, BottleneckCSPA, [1024]],  # 11
  ]

# CSP-Dark-PAN head
head:
  [[-1, 1, SPPCSPC, [512]],  # 12
   [-1, 1, Conv, [384, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [-6, 1, Conv, [384, 1, 1]],  # route backbone P5
   [[-1, -2], 1, Concat, [1]],
   [-1, 3, BottleneckCSPB, [384]],  # 17
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [-13, 1, Conv, [256, 1, 1]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],
   [-1, 3, BottleneckCSPB, [256]],  # 22
   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [-20, 1, Conv, [128, 1, 1]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],
   [-1, 3, BottleneckCSPB, [128]],  # 27
   [-1, 1, Conv, [256, 3, 1]],
   [-2, 1, Conv, [256, 3, 2]],
   [[-1, 22], 1, Concat, [1]],  # cat
   [-1, 3, BottleneckCSPB, [256]],  # 31
   [-1, 1, Conv, [512, 3, 1]],
   [-2, 1, Conv, [384, 3, 2]],
   [[-1, 17], 1, Concat, [1]],  # cat
   [-1, 3, BottleneckCSPB, [384]],  # 35
   [-1, 1, Conv, [768, 3, 1]],
   [-2, 1, Conv, [512, 3, 2]],
   [[-1, 12], 1, Concat, [1]],  # cat
   [-1, 3, BottleneckCSPB, [512]],  # 39
   [-1, 1, Conv, [1024, 3, 1]],

   [[28,32,36,40], 1, IDetect, [nc, anchors]],  # Detect(P3, P4, P5, P6)
  ]
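All four YOLOR P6 variants share the anchor table above: one row of three `(width, height)` pairs per detection scale, expressed in input-image pixels (an assumption consistent with the `P3/8 .. P6/64` comments). A small illustration of the stride mapping:

```python
# Hypothetical helper: convert the pixel anchors to grid units per scale.
anchors = [
    [19, 27, 44, 40, 38, 94],        # P3, stride 8
    [96, 68, 86, 152, 180, 137],     # P4, stride 16
    [140, 301, 303, 264, 238, 542],  # P5, stride 32
    [436, 615, 739, 380, 925, 792],  # P6, stride 64
]
for row, stride in zip(anchors, (8, 16, 32, 64)):
    pairs = list(zip(row[::2], row[1::2]))
    print(f"stride {stride}:", [(w / stride, h / stride) for w, h in pairs])
```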
51
yolov7-tracker-example/cfg/baseline/yolov3-spp.yaml
Normal file
@@ -0,0 +1,51 @@
# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# darknet53 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [32, 3, 1]],  # 0
   [-1, 1, Conv, [64, 3, 2]],  # 1-P1/2
   [-1, 1, Bottleneck, [64]],
   [-1, 1, Conv, [128, 3, 2]],  # 3-P2/4
   [-1, 2, Bottleneck, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 5-P3/8
   [-1, 8, Bottleneck, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 7-P4/16
   [-1, 8, Bottleneck, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 9-P5/32
   [-1, 4, Bottleneck, [1024]],  # 10
  ]

# YOLOv3-SPP head
head:
  [[-1, 1, Bottleneck, [1024, False]],
   [-1, 1, SPP, [512, [5, 9, 13]]],
   [-1, 1, Conv, [1024, 3, 1]],
   [-1, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [1024, 3, 1]],  # 15 (P5/32-large)

   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 8], 1, Concat, [1]],  # cat backbone P4
   [-1, 1, Bottleneck, [512, False]],
   [-1, 1, Bottleneck, [512, False]],
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [512, 3, 1]],  # 22 (P4/16-medium)

   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P3
   [-1, 1, Bottleneck, [256, False]],
   [-1, 2, Bottleneck, [256, False]],  # 27 (P3/8-small)

   [[27, 22, 15], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
51
yolov7-tracker-example/cfg/baseline/yolov3.yaml
Normal file
@@ -0,0 +1,51 @@
# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# darknet53 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [32, 3, 1]],  # 0
   [-1, 1, Conv, [64, 3, 2]],  # 1-P1/2
   [-1, 1, Bottleneck, [64]],
   [-1, 1, Conv, [128, 3, 2]],  # 3-P2/4
   [-1, 2, Bottleneck, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 5-P3/8
   [-1, 8, Bottleneck, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 7-P4/16
   [-1, 8, Bottleneck, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 9-P5/32
   [-1, 4, Bottleneck, [1024]],  # 10
  ]

# YOLOv3 head
head:
  [[-1, 1, Bottleneck, [1024, False]],
   [-1, 1, Conv, [512, [1, 1]]],
   [-1, 1, Conv, [1024, 3, 1]],
   [-1, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [1024, 3, 1]],  # 15 (P5/32-large)

   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 8], 1, Concat, [1]],  # cat backbone P4
   [-1, 1, Bottleneck, [512, False]],
   [-1, 1, Bottleneck, [512, False]],
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [512, 3, 1]],  # 22 (P4/16-medium)

   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P3
   [-1, 1, Bottleneck, [256, False]],
   [-1, 2, Bottleneck, [256, False]],  # 27 (P3/8-small)

   [[27, 22, 15], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
52
yolov7-tracker-example/cfg/baseline/yolov4-csp.yaml
Normal file
@@ -0,0 +1,52 @@
# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [12,16, 19,36, 40,28]  # P3/8
  - [36,75, 76,55, 72,146]  # P4/16
  - [142,110, 192,243, 459,401]  # P5/32

# CSP-Darknet backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [32, 3, 1]],  # 0
   [-1, 1, Conv, [64, 3, 2]],  # 1-P1/2
   [-1, 1, Bottleneck, [64]],
   [-1, 1, Conv, [128, 3, 2]],  # 3-P2/4
   [-1, 2, BottleneckCSPC, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 5-P3/8
   [-1, 8, BottleneckCSPC, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 7-P4/16
   [-1, 8, BottleneckCSPC, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 9-P5/32
   [-1, 4, BottleneckCSPC, [1024]],  # 10
  ]

# CSP-Dark-PAN head
head:
  [[-1, 1, SPPCSPC, [512]],  # 11
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [8, 1, Conv, [256, 1, 1]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],
   [-1, 2, BottleneckCSPB, [256]],  # 16
   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [6, 1, Conv, [128, 1, 1]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],
   [-1, 2, BottleneckCSPB, [128]],  # 21
   [-1, 1, Conv, [256, 3, 1]],
   [-2, 1, Conv, [256, 3, 2]],
   [[-1, 16], 1, Concat, [1]],  # cat
   [-1, 2, BottleneckCSPB, [256]],  # 25
   [-1, 1, Conv, [512, 3, 1]],
   [-2, 1, Conv, [512, 3, 2]],
   [[-1, 11], 1, Concat, [1]],  # cat
   [-1, 2, BottleneckCSPB, [512]],  # 29
   [-1, 1, Conv, [1024, 3, 1]],

   [[22,26,30], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
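Note that yolov3, yolov3-spp, and yolov4-csp end in a plain `Detect` layer, while the YOLOR-style configs above use `IDetect`. The difference is YOLOR's "implicit knowledge": IDetect wraps each detection conv with a learned additive prior on its input and a multiplicative one on its output. A sketch of those two wrappers, assuming the upstream YOLOv7 formulation (treat the init constants as assumptions):

```python
import torch
import torch.nn as nn

class ImplicitA(nn.Module):
    # Learned additive prior, broadcast over the spatial grid.
    def __init__(self, channel):
        super().__init__()
        self.implicit = nn.Parameter(torch.zeros(1, channel, 1, 1))
        nn.init.normal_(self.implicit, std=0.02)  # assumed init

    def forward(self, x):
        return x + self.implicit


class ImplicitM(nn.Module):
    # Learned multiplicative prior applied to the head's raw predictions.
    def __init__(self, channel):
        super().__init__()
        self.implicit = nn.Parameter(torch.ones(1, channel, 1, 1))
        nn.init.normal_(self.implicit, mean=1.0, std=0.02)  # assumed init

    def forward(self, x):
        return x * self.implicit
```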
202
yolov7-tracker-example/cfg/deploy/yolov7-d6.yaml
Normal file
@@ -0,0 +1,202 @@
# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [ 19,27, 44,40, 38,94 ]  # P3/8
  - [ 96,68, 86,152, 180,137 ]  # P4/16
  - [ 140,301, 303,264, 238,542 ]  # P5/32
  - [ 436,615, 739,380, 925,792 ]  # P6/64

# yolov7-d6 backbone
backbone:
  # [from, number, module, args],
  [[-1, 1, ReOrg, []],  # 0
   [-1, 1, Conv, [96, 3, 1]],  # 1-P1/2

   [-1, 1, DownC, [192]],  # 2-P2/4
   [-1, 1, Conv, [64, 1, 1]],
   [-2, 1, Conv, [64, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -3, -5, -7, -9, -10], 1, Concat, [1]],
   [-1, 1, Conv, [192, 1, 1]],  # 14

   [-1, 1, DownC, [384]],  # 15-P3/8
   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -3, -5, -7, -9, -10], 1, Concat, [1]],
   [-1, 1, Conv, [384, 1, 1]],  # 27

   [-1, 1, DownC, [768]],  # 28-P4/16
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -7, -9, -10], 1, Concat, [1]],
   [-1, 1, Conv, [768, 1, 1]],  # 40

   [-1, 1, DownC, [1152]],  # 41-P5/32
   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [[-1, -3, -5, -7, -9, -10], 1, Concat, [1]],
   [-1, 1, Conv, [1152, 1, 1]],  # 53

   [-1, 1, DownC, [1536]],  # 54-P6/64
   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [[-1, -3, -5, -7, -9, -10], 1, Concat, [1]],
   [-1, 1, Conv, [1536, 1, 1]],  # 66
  ]

# yolov7-d6 head
head:
  [[-1, 1, SPPCSPC, [768]],  # 67

   [-1, 1, Conv, [576, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [53, 1, Conv, [576, 1, 1]],  # route backbone P5
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8, -9, -10], 1, Concat, [1]],
   [-1, 1, Conv, [576, 1, 1]],  # 83

   [-1, 1, Conv, [384, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [40, 1, Conv, [384, 1, 1]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8, -9, -10], 1, Concat, [1]],
   [-1, 1, Conv, [384, 1, 1]],  # 99

   [-1, 1, Conv, [192, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [27, 1, Conv, [192, 1, 1]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8, -9, -10], 1, Concat, [1]],
   [-1, 1, Conv, [192, 1, 1]],  # 115

   [-1, 1, DownC, [384]],
   [[-1, 99], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8, -9, -10], 1, Concat, [1]],
   [-1, 1, Conv, [384, 1, 1]],  # 129

   [-1, 1, DownC, [576]],
   [[-1, 83], 1, Concat, [1]],

   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8, -9, -10], 1, Concat, [1]],
   [-1, 1, Conv, [576, 1, 1]],  # 143

   [-1, 1, DownC, [768]],
   [[-1, 67], 1, Concat, [1]],

   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8, -9, -10], 1, Concat, [1]],
   [-1, 1, Conv, [768, 1, 1]],  # 157

   [115, 1, Conv, [384, 3, 1]],
   [129, 1, Conv, [768, 3, 1]],
   [143, 1, Conv, [1152, 3, 1]],
   [157, 1, Conv, [1536, 3, 1]],

   [[158,159,160,161], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5, P6)
  ]
180
yolov7-tracker-example/cfg/deploy/yolov7-e6.yaml
Normal file
@@ -0,0 +1,180 @@
# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [ 19,27, 44,40, 38,94 ]  # P3/8
  - [ 96,68, 86,152, 180,137 ]  # P4/16
  - [ 140,301, 303,264, 238,542 ]  # P5/32
  - [ 436,615, 739,380, 925,792 ]  # P6/64

# yolov7-e6 backbone
backbone:
  # [from, number, module, args],
  [[-1, 1, ReOrg, []],  # 0
   [-1, 1, Conv, [80, 3, 1]],  # 1-P1/2

   [-1, 1, DownC, [160]],  # 2-P2/4
   [-1, 1, Conv, [64, 1, 1]],
   [-2, 1, Conv, [64, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [160, 1, 1]],  # 12

   [-1, 1, DownC, [320]],  # 13-P3/8
   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 23

   [-1, 1, DownC, [640]],  # 24-P4/16
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [640, 1, 1]],  # 34

   [-1, 1, DownC, [960]],  # 35-P5/32
   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [960, 1, 1]],  # 45

   [-1, 1, DownC, [1280]],  # 46-P6/64
   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [1280, 1, 1]],  # 56
  ]

# yolov7-e6 head
head:
  [[-1, 1, SPPCSPC, [640]],  # 57

   [-1, 1, Conv, [480, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [45, 1, Conv, [480, 1, 1]],  # route backbone P5
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [480, 1, 1]],  # 71

   [-1, 1, Conv, [320, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [34, 1, Conv, [320, 1, 1]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 85

   [-1, 1, Conv, [160, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [23, 1, Conv, [160, 1, 1]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [160, 1, 1]],  # 99

   [-1, 1, DownC, [320]],
   [[-1, 85], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 111

   [-1, 1, DownC, [480]],
   [[-1, 71], 1, Concat, [1]],

   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [480, 1, 1]],  # 123

   [-1, 1, DownC, [640]],
   [[-1, 57], 1, Concat, [1]],

   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [640, 1, 1]],  # 135

   [99, 1, Conv, [320, 3, 1]],
   [111, 1, Conv, [640, 3, 1]],
   [123, 1, Conv, [960, 3, 1]],
   [135, 1, Conv, [1280, 3, 1]],

   [[136,137,138,139], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5, P6)
  ]
301
yolov7-tracker-example/cfg/deploy/yolov7-e6e.yaml
Normal file
@@ -0,0 +1,301 @@
# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [ 19,27, 44,40, 38,94 ]  # P3/8
  - [ 96,68, 86,152, 180,137 ]  # P4/16
  - [ 140,301, 303,264, 238,542 ]  # P5/32
  - [ 436,615, 739,380, 925,792 ]  # P6/64

# yolov7-e6e backbone
backbone:
  # [from, number, module, args],
  [[-1, 1, ReOrg, []],  # 0
   [-1, 1, Conv, [80, 3, 1]],  # 1-P1/2

   [-1, 1, DownC, [160]],  # 2-P2/4
   [-1, 1, Conv, [64, 1, 1]],
   [-2, 1, Conv, [64, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [160, 1, 1]],  # 12
   [-11, 1, Conv, [64, 1, 1]],
   [-12, 1, Conv, [64, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [160, 1, 1]],  # 22
   [[-1, -11], 1, Shortcut, [1]],  # 23

   [-1, 1, DownC, [320]],  # 24-P3/8
   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 34
   [-11, 1, Conv, [128, 1, 1]],
   [-12, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 44
   [[-1, -11], 1, Shortcut, [1]],  # 45

   [-1, 1, DownC, [640]],  # 46-P4/16
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [640, 1, 1]],  # 56
   [-11, 1, Conv, [256, 1, 1]],
   [-12, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [640, 1, 1]],  # 66
   [[-1, -11], 1, Shortcut, [1]],  # 67

   [-1, 1, DownC, [960]],  # 68-P5/32
   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [960, 1, 1]],  # 78
   [-11, 1, Conv, [384, 1, 1]],
   [-12, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [960, 1, 1]],  # 88
   [[-1, -11], 1, Shortcut, [1]],  # 89

   [-1, 1, DownC, [1280]],  # 90-P6/64
   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [1280, 1, 1]],  # 100
   [-11, 1, Conv, [512, 1, 1]],
   [-12, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [1280, 1, 1]],  # 110
   [[-1, -11], 1, Shortcut, [1]],  # 111
  ]

# yolov7-e6e head
head:
  [[-1, 1, SPPCSPC, [640]],  # 112

   [-1, 1, Conv, [480, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [89, 1, Conv, [480, 1, 1]],  # route backbone P5
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [480, 1, 1]],  # 126
   [-11, 1, Conv, [384, 1, 1]],
   [-12, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [480, 1, 1]],  # 136
   [[-1, -11], 1, Shortcut, [1]],  # 137

   [-1, 1, Conv, [320, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [67, 1, Conv, [320, 1, 1]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 151
   [-11, 1, Conv, [256, 1, 1]],
   [-12, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 161
   [[-1, -11], 1, Shortcut, [1]],  # 162

   [-1, 1, Conv, [160, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [45, 1, Conv, [160, 1, 1]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [160, 1, 1]],  # 176
   [-11, 1, Conv, [128, 1, 1]],
   [-12, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [160, 1, 1]],  # 186
   [[-1, -11], 1, Shortcut, [1]],  # 187

   [-1, 1, DownC, [320]],
   [[-1, 162], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 199
   [-11, 1, Conv, [256, 1, 1]],
   [-12, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 209
   [[-1, -11], 1, Shortcut, [1]],  # 210

   [-1, 1, DownC, [480]],
   [[-1, 137], 1, Concat, [1]],

   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [480, 1, 1]],  # 222
   [-11, 1, Conv, [384, 1, 1]],
   [-12, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [480, 1, 1]],  # 232
   [[-1, -11], 1, Shortcut, [1]],  # 233

   [-1, 1, DownC, [640]],
   [[-1, 112], 1, Concat, [1]],

   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [640, 1, 1]],  # 245
   [-11, 1, Conv, [512, 1, 1]],
   [-12, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [640, 1, 1]],  # 255
   [[-1, -11], 1, Shortcut, [1]],  # 256

   [187, 1, Conv, [320, 3, 1]],
   [210, 1, Conv, [640, 3, 1]],
   [233, 1, Conv, [960, 3, 1]],
   [256, 1, Conv, [1280, 3, 1]],

   [[257,258,259,260], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5, P6)
  ]
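yolov7-e6e is the only config here that duplicates every ELAN block into twin branches and merges them with `Shortcut` rows such as `[[-1, -11], 1, Shortcut, [1]]`. A sketch of that merge, assuming the upstream module (an elementwise sum of the two routed tensors):

```python
import torch.nn as nn

class Shortcut(nn.Module):
    def __init__(self, dimension=0):
        super().__init__()
        self.d = dimension  # kept only for config compatibility

    def forward(self, x):
        # x is the list of routed inputs, e.g. layers -1 and -11 above;
        # their shapes match, so the merge is a plain elementwise add.
        return x[0] + x[1]
```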
112
yolov7-tracker-example/cfg/deploy/yolov7-tiny-silu.yaml
Normal file
@@ -0,0 +1,112 @@
# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv7-tiny backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [32, 3, 2]],  # 0-P1/2

   [-1, 1, Conv, [64, 3, 2]],  # 1-P2/4

   [-1, 1, Conv, [32, 1, 1]],
   [-2, 1, Conv, [32, 1, 1]],
   [-1, 1, Conv, [32, 3, 1]],
   [-1, 1, Conv, [32, 3, 1]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [64, 1, 1]],  # 7

   [-1, 1, MP, []],  # 8-P3/8
   [-1, 1, Conv, [64, 1, 1]],
   [-2, 1, Conv, [64, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1]],  # 14

   [-1, 1, MP, []],  # 15-P4/16
   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],  # 21

   [-1, 1, MP, []],  # 22-P5/32
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1]],  # 28
  ]

# YOLOv7-tiny head
head:
  [[-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, SP, [5]],
   [-2, 1, SP, [9]],
   [-3, 1, SP, [13]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],
   [[-1, -7], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],  # 37

   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [21, 1, Conv, [128, 1, 1]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [64, 1, 1]],
   [-2, 1, Conv, [64, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1]],  # 47

   [-1, 1, Conv, [64, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [14, 1, Conv, [64, 1, 1]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [32, 1, 1]],
   [-2, 1, Conv, [32, 1, 1]],
   [-1, 1, Conv, [32, 3, 1]],
   [-1, 1, Conv, [32, 3, 1]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [64, 1, 1]],  # 57

   [-1, 1, Conv, [128, 3, 2]],
   [[-1, 47], 1, Concat, [1]],

   [-1, 1, Conv, [64, 1, 1]],
   [-2, 1, Conv, [64, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1]],  # 65

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 37], 1, Concat, [1]],

   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],  # 73

   [57, 1, Conv, [128, 3, 1]],
   [65, 1, Conv, [256, 3, 1]],
   [73, 1, Conv, [512, 3, 1]],

   [[74,75,76], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
112
yolov7-tracker-example/cfg/deploy/yolov7-tiny.yaml
Normal file
@@ -0,0 +1,112 @@
# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# yolov7-tiny backbone
backbone:
  # [from, number, module, args] c2, k=1, s=1, p=None, g=1, act=True
  [[-1, 1, Conv, [32, 3, 2, None, 1, nn.LeakyReLU(0.1)]],  # 0-P1/2

   [-1, 1, Conv, [64, 3, 2, None, 1, nn.LeakyReLU(0.1)]],  # 1-P2/4

   [-1, 1, Conv, [32, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [32, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [32, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [32, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 7

   [-1, 1, MP, []],  # 8-P3/8
   [-1, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [64, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [64, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 14

   [-1, 1, MP, []],  # 15-P4/16
   [-1, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [128, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [128, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 21

   [-1, 1, MP, []],  # 22-P5/32
   [-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [256, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [256, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 28
  ]

# yolov7-tiny head
head:
  [[-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, SP, [5]],
   [-2, 1, SP, [9]],
   [-3, 1, SP, [13]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -7], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 37

   [-1, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [21, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [64, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [64, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 47

   [-1, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [14, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [32, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [32, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [32, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [32, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 57

   [-1, 1, Conv, [128, 3, 2, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, 47], 1, Concat, [1]],

   [-1, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [64, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [64, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 65

   [-1, 1, Conv, [256, 3, 2, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, 37], 1, Concat, [1]],

   [-1, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [128, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [128, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 73

   [57, 1, Conv, [128, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [65, 1, Conv, [256, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [73, 1, Conv, [512, 3, 1, None, 1, nn.LeakyReLU(0.1)]],

   [[74,75,76], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
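The comment in yolov7-tiny.yaml spells out the Conv signature (`c2, k=1, s=1, p=None, g=1, act=True`), and every row passes `nn.LeakyReLU(0.1)` as the trailing `act` argument; yolov7-tiny-silu omits it and falls back to the default activation. A sketch of a Conv block honoring that override (the `autopad` helper and the SiLU default are assumptions in the usual YOLOv5/v7 style):

```python
import torch.nn as nn

def autopad(k, p=None):
    # 'same' padding for odd kernel sizes when p is None (assumed helper)
    return k // 2 if p is None else p

class Conv(nn.Module):
    # Conv2d + BatchNorm2d + activation, with the activation overridable
    # per row as in the yolov7-tiny config above.
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):
        super().__init__()
        self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        if act is True:
            self.act = nn.SiLU()      # assumed default
        elif isinstance(act, nn.Module):
            self.act = act            # e.g. nn.LeakyReLU(0.1)
        else:
            self.act = nn.Identity()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))
```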
158
yolov7-tracker-example/cfg/deploy/yolov7-w6.yaml
Normal file
158
yolov7-tracker-example/cfg/deploy/yolov7-w6.yaml
Normal file
@ -0,0 +1,158 @@
|
||||
# parameters
|
||||
nc: 80 # number of classes
|
||||
depth_multiple: 1.0 # model depth multiple
|
||||
width_multiple: 1.0 # layer channel multiple
|
||||
|
||||
# anchors
|
||||
anchors:
|
||||
- [ 19,27, 44,40, 38,94 ] # P3/8
|
||||
- [ 96,68, 86,152, 180,137 ] # P4/16
|
||||
- [ 140,301, 303,264, 238,542 ] # P5/32
|
||||
- [ 436,615, 739,380, 925,792 ] # P6/64
|
||||
|
||||
# yolov7-w6 backbone
|
||||
backbone:
|
||||
  # [from, number, module, args]
  [[-1, 1, ReOrg, []],  # 0
   [-1, 1, Conv, [64, 3, 1]],  # 1-P1/2

   [-1, 1, Conv, [128, 3, 2]],  # 2-P2/4
   [-1, 1, Conv, [64, 1, 1]],
   [-2, 1, Conv, [64, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1]],  # 10

   [-1, 1, Conv, [256, 3, 2]],  # 11-P3/8
   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],  # 19

   [-1, 1, Conv, [512, 3, 2]],  # 20-P4/16
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1]],  # 28

   [-1, 1, Conv, [768, 3, 2]],  # 29-P5/32
   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [768, 1, 1]],  # 37

   [-1, 1, Conv, [1024, 3, 2]],  # 38-P6/64
   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [1024, 1, 1]],  # 46
  ]

# yolov7-w6 head
head:
  [[-1, 1, SPPCSPC, [512]],  # 47

   [-1, 1, Conv, [384, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [37, 1, Conv, [384, 1, 1]],  # route backbone P5
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [384, 1, 1]],  # 59

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [28, 1, Conv, [256, 1, 1]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],  # 71

   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [19, 1, Conv, [128, 1, 1]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1]],  # 83

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 71], 1, Concat, [1]],  # cat

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],  # 93

   [-1, 1, Conv, [384, 3, 2]],
   [[-1, 59], 1, Concat, [1]],  # cat

   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [384, 1, 1]],  # 103

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 47], 1, Concat, [1]],  # cat

   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1]],  # 113

   [83, 1, Conv, [256, 3, 1]],
   [93, 1, Conv, [512, 3, 1]],
   [103, 1, Conv, [768, 3, 1]],
   [113, 1, Conv, [1024, 3, 1]],

   [[114,115,116,117], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5, P6)
  ]
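Every row in these config files follows the [from, number, module, args] schema named in the comments: "from" is the input layer index or indices (-1 is the previous layer, a bare positive index routes from an earlier layer, a list feeds Concat), "number" is the repeat count, "module" is the layer class, and "args" are its constructor arguments. A minimal sketch of walking such a file, modeled on the common YOLOv5/YOLOv7-style parse_model() convention (illustrative only, not this repo's exact parser):

import yaml

def walk_config(path):
    # Load a model yaml and print one line per [from, number, module, args] row.
    with open(path) as f:
        cfg = yaml.safe_load(f)
    rows = cfg['backbone'] + cfg['head']
    for i, (frm, number, module, args) in enumerate(rows):
        # frm:    input layer index/indices; -1 = previous layer,
        #         a list (e.g. [-1, -3, -5, -6]) feeds Concat
        # number: repeat count (real parsers scale this by depth_multiple)
        # module: layer class name, e.g. Conv, MP, SPPCSPC, RepConv, Detect
        # args:   constructor arguments, e.g. [out_channels, kernel, stride]
        print(f'{i:3d}  from={frm!s:20s} n={number} {module}({args})')

walk_config('yolov7-tracker-example/cfg/deploy/yolov7.yaml')

Run against any of the yaml files added in this commit, this prints one line per layer, which makes the index comments (# 10, # 19, ...) easy to verify.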
140
yolov7-tracker-example/cfg/deploy/yolov7.yaml
Normal file
@ -0,0 +1,140 @@
# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [12,16, 19,36, 40,28]  # P3/8
  - [36,75, 76,55, 72,146]  # P4/16
  - [142,110, 192,243, 459,401]  # P5/32

# yolov7 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [32, 3, 1]],  # 0

   [-1, 1, Conv, [64, 3, 2]],  # 1-P1/2
   [-1, 1, Conv, [64, 3, 1]],

   [-1, 1, Conv, [128, 3, 2]],  # 3-P2/4
   [-1, 1, Conv, [64, 1, 1]],
   [-2, 1, Conv, [64, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],  # 11

   [-1, 1, MP, []],
   [-1, 1, Conv, [128, 1, 1]],
   [-3, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 16-P3/8
   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1]],  # 24

   [-1, 1, MP, []],
   [-1, 1, Conv, [256, 1, 1]],
   [-3, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 29-P4/16
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [1024, 1, 1]],  # 37

   [-1, 1, MP, []],
   [-1, 1, Conv, [512, 1, 1]],
   [-3, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [512, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 42-P5/32
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [1024, 1, 1]],  # 50
  ]

# yolov7 head
head:
  [[-1, 1, SPPCSPC, [512]],  # 51

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [37, 1, Conv, [256, 1, 1]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],  # 63

   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [24, 1, Conv, [128, 1, 1]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1]],  # 75

   [-1, 1, MP, []],
   [-1, 1, Conv, [128, 1, 1]],
   [-3, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 2]],
   [[-1, -3, 63], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],  # 88

   [-1, 1, MP, []],
   [-1, 1, Conv, [256, 1, 1]],
   [-3, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 2]],
   [[-1, -3, 51], 1, Concat, [1]],

   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1]],  # 101

   [75, 1, RepConv, [256, 3, 1]],
   [88, 1, RepConv, [512, 3, 1]],
   [101, 1, RepConv, [1024, 3, 1]],

   [[102,103,104], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
156
yolov7-tracker-example/cfg/deploy/yolov7x.yaml
Normal file
@ -0,0 +1,156 @@
# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [12,16, 19,36, 40,28]  # P3/8
  - [36,75, 76,55, 72,146]  # P4/16
  - [142,110, 192,243, 459,401]  # P5/32

# yolov7x backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [40, 3, 1]],  # 0

   [-1, 1, Conv, [80, 3, 2]],  # 1-P1/2
   [-1, 1, Conv, [80, 3, 1]],

   [-1, 1, Conv, [160, 3, 2]],  # 3-P2/4
   [-1, 1, Conv, [64, 1, 1]],
   [-2, 1, Conv, [64, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 13

   [-1, 1, MP, []],
   [-1, 1, Conv, [160, 1, 1]],
   [-3, 1, Conv, [160, 1, 1]],
   [-1, 1, Conv, [160, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 18-P3/8
   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [640, 1, 1]],  # 28

   [-1, 1, MP, []],
   [-1, 1, Conv, [320, 1, 1]],
   [-3, 1, Conv, [320, 1, 1]],
   [-1, 1, Conv, [320, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 33-P4/16
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [1280, 1, 1]],  # 43

   [-1, 1, MP, []],
   [-1, 1, Conv, [640, 1, 1]],
   [-3, 1, Conv, [640, 1, 1]],
   [-1, 1, Conv, [640, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 48-P5/32
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [1280, 1, 1]],  # 58
  ]

# yolov7x head
head:
  [[-1, 1, SPPCSPC, [640]],  # 59

   [-1, 1, Conv, [320, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [43, 1, Conv, [320, 1, 1]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 73

   [-1, 1, Conv, [160, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [28, 1, Conv, [160, 1, 1]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [160, 1, 1]],  # 87

   [-1, 1, MP, []],
   [-1, 1, Conv, [160, 1, 1]],
   [-3, 1, Conv, [160, 1, 1]],
   [-1, 1, Conv, [160, 3, 2]],
   [[-1, -3, 73], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 102

   [-1, 1, MP, []],
   [-1, 1, Conv, [320, 1, 1]],
   [-3, 1, Conv, [320, 1, 1]],
   [-1, 1, Conv, [320, 3, 2]],
   [[-1, -3, 59], 1, Concat, [1]],

   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [640, 1, 1]],  # 117

   [87, 1, Conv, [320, 3, 1]],
   [102, 1, Conv, [640, 3, 1]],
   [117, 1, Conv, [1280, 3, 1]],

   [[118,119,120], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
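Note that yolov7x gets its extra capacity from wider base channels written directly into the config (40/80/160/320 versus yolov7's 32/64/128/256) rather than from a larger width_multiple, which stays at 1.0 here. For reference, a sketch of how YOLO-style parsers typically apply depth_multiple/width_multiple when they are not 1.0 (an assumed convention, not this repo's verified code):

import math

def make_divisible(x, divisor=8):
    # round a scaled channel count up to the nearest multiple of `divisor`
    return math.ceil(x / divisor) * divisor

def scale_row(number, out_channels, depth_multiple=1.0, width_multiple=1.0):
    # repeats scale with depth_multiple, channels with width_multiple
    n = max(round(number * depth_multiple), 1) if number > 1 else number
    c2 = make_divisible(out_channels * width_multiple)
    return n, c2

print(scale_row(1, 64))             # (1, 64): multiples of 1.0 are a no-op
print(scale_row(1, 64, 1.0, 1.25))  # (1, 80): 1.25x width gives yolov7x-like channels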
207
yolov7-tracker-example/cfg/training/yolov7-d6.yaml
Normal file
@ -0,0 +1,207 @@
# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [ 19,27, 44,40, 38,94 ]  # P3/8
  - [ 96,68, 86,152, 180,137 ]  # P4/16
  - [ 140,301, 303,264, 238,542 ]  # P5/32
  - [ 436,615, 739,380, 925,792 ]  # P6/64

# yolov7 backbone
backbone:
  # [from, number, module, args],
  [[-1, 1, ReOrg, []],  # 0
   [-1, 1, Conv, [96, 3, 1]],  # 1-P1/2

   [-1, 1, DownC, [192]],  # 2-P2/4
   [-1, 1, Conv, [64, 1, 1]],
   [-2, 1, Conv, [64, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -3, -5, -7, -9, -10], 1, Concat, [1]],
   [-1, 1, Conv, [192, 1, 1]],  # 14

   [-1, 1, DownC, [384]],  # 15-P3/8
   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -3, -5, -7, -9, -10], 1, Concat, [1]],
   [-1, 1, Conv, [384, 1, 1]],  # 27

   [-1, 1, DownC, [768]],  # 28-P4/16
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -7, -9, -10], 1, Concat, [1]],
   [-1, 1, Conv, [768, 1, 1]],  # 40

   [-1, 1, DownC, [1152]],  # 41-P5/32
   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [[-1, -3, -5, -7, -9, -10], 1, Concat, [1]],
   [-1, 1, Conv, [1152, 1, 1]],  # 53

   [-1, 1, DownC, [1536]],  # 54-P6/64
   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [[-1, -3, -5, -7, -9, -10], 1, Concat, [1]],
   [-1, 1, Conv, [1536, 1, 1]],  # 66
  ]

# yolov7 head
head:
  [[-1, 1, SPPCSPC, [768]],  # 67

   [-1, 1, Conv, [576, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [53, 1, Conv, [576, 1, 1]],  # route backbone P5
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8, -9, -10], 1, Concat, [1]],
   [-1, 1, Conv, [576, 1, 1]],  # 83

   [-1, 1, Conv, [384, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [40, 1, Conv, [384, 1, 1]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8, -9, -10], 1, Concat, [1]],
   [-1, 1, Conv, [384, 1, 1]],  # 99

   [-1, 1, Conv, [192, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [27, 1, Conv, [192, 1, 1]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8, -9, -10], 1, Concat, [1]],
   [-1, 1, Conv, [192, 1, 1]],  # 115

   [-1, 1, DownC, [384]],
   [[-1, 99], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8, -9, -10], 1, Concat, [1]],
   [-1, 1, Conv, [384, 1, 1]],  # 129

   [-1, 1, DownC, [576]],
   [[-1, 83], 1, Concat, [1]],

   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8, -9, -10], 1, Concat, [1]],
   [-1, 1, Conv, [576, 1, 1]],  # 143

   [-1, 1, DownC, [768]],
   [[-1, 67], 1, Concat, [1]],

   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8, -9, -10], 1, Concat, [1]],
   [-1, 1, Conv, [768, 1, 1]],  # 157

   [115, 1, Conv, [384, 3, 1]],
   [129, 1, Conv, [768, 3, 1]],
   [143, 1, Conv, [1152, 3, 1]],
   [157, 1, Conv, [1536, 3, 1]],

   [115, 1, Conv, [384, 3, 1]],
   [99, 1, Conv, [768, 3, 1]],
   [83, 1, Conv, [1152, 3, 1]],
   [67, 1, Conv, [1536, 3, 1]],

   [[158,159,160,161,162,163,164,165], 1, IAuxDetect, [nc, anchors]],  # Detect(P3, P4, P5, P6)
  ]
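Unlike the deploy configs, which end in a four-input Detect, the training configs (d6/e6/e6e/w6) end with eight 3x3 Conv rows feeding IAuxDetect. Reading the "from" indices above, the first four outputs are built from the final P3-P6 head layers (115, 129, 143, 157) and the second four re-tap earlier layers (115, 99, 83, 67), which matches the lead-head/auxiliary-head split implied by the IAuxDetect name; this is my reading of the indices, not verified against the module code:

# Grouping the eight IAuxDetect inputs of yolov7-d6.yaml (indices copied from
# the config above; the lead/aux interpretation is an assumption).
detect_from = [158, 159, 160, 161, 162, 163, 164, 165]
lead_heads = detect_from[:4]  # 3x3 Convs on layers 115, 129, 143, 157 (P3..P6)
aux_heads = detect_from[4:]   # 3x3 Convs re-tapping layers 115, 99, 83, 67
print(lead_heads, aux_heads)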
185
yolov7-tracker-example/cfg/training/yolov7-e6.yaml
Normal file
@ -0,0 +1,185 @@
# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [ 19,27, 44,40, 38,94 ]  # P3/8
  - [ 96,68, 86,152, 180,137 ]  # P4/16
  - [ 140,301, 303,264, 238,542 ]  # P5/32
  - [ 436,615, 739,380, 925,792 ]  # P6/64

# yolov7 backbone
backbone:
  # [from, number, module, args],
  [[-1, 1, ReOrg, []],  # 0
   [-1, 1, Conv, [80, 3, 1]],  # 1-P1/2

   [-1, 1, DownC, [160]],  # 2-P2/4
   [-1, 1, Conv, [64, 1, 1]],
   [-2, 1, Conv, [64, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [160, 1, 1]],  # 12

   [-1, 1, DownC, [320]],  # 13-P3/8
   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 23

   [-1, 1, DownC, [640]],  # 24-P4/16
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [640, 1, 1]],  # 34

   [-1, 1, DownC, [960]],  # 35-P5/32
   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [960, 1, 1]],  # 45

   [-1, 1, DownC, [1280]],  # 46-P6/64
   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [1280, 1, 1]],  # 56
  ]

# yolov7 head
head:
  [[-1, 1, SPPCSPC, [640]],  # 57

   [-1, 1, Conv, [480, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [45, 1, Conv, [480, 1, 1]],  # route backbone P5
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [480, 1, 1]],  # 71

   [-1, 1, Conv, [320, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [34, 1, Conv, [320, 1, 1]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 85

   [-1, 1, Conv, [160, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [23, 1, Conv, [160, 1, 1]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [160, 1, 1]],  # 99

   [-1, 1, DownC, [320]],
   [[-1, 85], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 111

   [-1, 1, DownC, [480]],
   [[-1, 71], 1, Concat, [1]],

   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [480, 1, 1]],  # 123

   [-1, 1, DownC, [640]],
   [[-1, 57], 1, Concat, [1]],

   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [640, 1, 1]],  # 135

   [99, 1, Conv, [320, 3, 1]],
   [111, 1, Conv, [640, 3, 1]],
   [123, 1, Conv, [960, 3, 1]],
   [135, 1, Conv, [1280, 3, 1]],

   [99, 1, Conv, [320, 3, 1]],
   [85, 1, Conv, [640, 3, 1]],
   [71, 1, Conv, [960, 3, 1]],
   [57, 1, Conv, [1280, 3, 1]],

   [[136,137,138,139,140,141,142,143], 1, IAuxDetect, [nc, anchors]],  # Detect(P3, P4, P5, P6)
  ]
306
yolov7-tracker-example/cfg/training/yolov7-e6e.yaml
Normal file
@ -0,0 +1,306 @@
# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [ 19,27, 44,40, 38,94 ]  # P3/8
  - [ 96,68, 86,152, 180,137 ]  # P4/16
  - [ 140,301, 303,264, 238,542 ]  # P5/32
  - [ 436,615, 739,380, 925,792 ]  # P6/64

# yolov7 backbone
backbone:
  # [from, number, module, args],
  [[-1, 1, ReOrg, []],  # 0
   [-1, 1, Conv, [80, 3, 1]],  # 1-P1/2

   [-1, 1, DownC, [160]],  # 2-P2/4
   [-1, 1, Conv, [64, 1, 1]],
   [-2, 1, Conv, [64, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [160, 1, 1]],  # 12
   [-11, 1, Conv, [64, 1, 1]],
   [-12, 1, Conv, [64, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [160, 1, 1]],  # 22
   [[-1, -11], 1, Shortcut, [1]],  # 23

   [-1, 1, DownC, [320]],  # 24-P3/8
   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 34
   [-11, 1, Conv, [128, 1, 1]],
   [-12, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 44
   [[-1, -11], 1, Shortcut, [1]],  # 45

   [-1, 1, DownC, [640]],  # 46-P4/16
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [640, 1, 1]],  # 56
   [-11, 1, Conv, [256, 1, 1]],
   [-12, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [640, 1, 1]],  # 66
   [[-1, -11], 1, Shortcut, [1]],  # 67

   [-1, 1, DownC, [960]],  # 68-P5/32
   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [960, 1, 1]],  # 78
   [-11, 1, Conv, [384, 1, 1]],
   [-12, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [960, 1, 1]],  # 88
   [[-1, -11], 1, Shortcut, [1]],  # 89

   [-1, 1, DownC, [1280]],  # 90-P6/64
   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [1280, 1, 1]],  # 100
   [-11, 1, Conv, [512, 1, 1]],
   [-12, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [1280, 1, 1]],  # 110
   [[-1, -11], 1, Shortcut, [1]],  # 111
  ]

# yolov7 head
head:
  [[-1, 1, SPPCSPC, [640]],  # 112

   [-1, 1, Conv, [480, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [89, 1, Conv, [480, 1, 1]],  # route backbone P5
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [480, 1, 1]],  # 126
   [-11, 1, Conv, [384, 1, 1]],
   [-12, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [480, 1, 1]],  # 136
   [[-1, -11], 1, Shortcut, [1]],  # 137

   [-1, 1, Conv, [320, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [67, 1, Conv, [320, 1, 1]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 151
   [-11, 1, Conv, [256, 1, 1]],
   [-12, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 161
   [[-1, -11], 1, Shortcut, [1]],  # 162

   [-1, 1, Conv, [160, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [45, 1, Conv, [160, 1, 1]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [160, 1, 1]],  # 176
   [-11, 1, Conv, [128, 1, 1]],
   [-12, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [160, 1, 1]],  # 186
   [[-1, -11], 1, Shortcut, [1]],  # 187

   [-1, 1, DownC, [320]],
   [[-1, 162], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 199
   [-11, 1, Conv, [256, 1, 1]],
   [-12, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 209
   [[-1, -11], 1, Shortcut, [1]],  # 210

   [-1, 1, DownC, [480]],
   [[-1, 137], 1, Concat, [1]],

   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [480, 1, 1]],  # 222
   [-11, 1, Conv, [384, 1, 1]],
   [-12, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [480, 1, 1]],  # 232
   [[-1, -11], 1, Shortcut, [1]],  # 233

   [-1, 1, DownC, [640]],
   [[-1, 112], 1, Concat, [1]],

   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [640, 1, 1]],  # 245
   [-11, 1, Conv, [512, 1, 1]],
   [-12, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -2, -3, -4, -5, -6, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [640, 1, 1]],  # 255
   [[-1, -11], 1, Shortcut, [1]],  # 256

   [187, 1, Conv, [320, 3, 1]],
   [210, 1, Conv, [640, 3, 1]],
   [233, 1, Conv, [960, 3, 1]],
   [256, 1, Conv, [1280, 3, 1]],

   [186, 1, Conv, [320, 3, 1]],
   [161, 1, Conv, [640, 3, 1]],
   [136, 1, Conv, [960, 3, 1]],
   [112, 1, Conv, [1280, 3, 1]],

   [[257,258,259,260,261,262,263,264], 1, IAuxDetect, [nc, anchors]],  # Detect(P3, P4, P5, P6)
  ]
112
yolov7-tracker-example/cfg/training/yolov7-tiny.yaml
Normal file
@ -0,0 +1,112 @@
# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# yolov7-tiny backbone
backbone:
  # [from, number, module, args] c2, k=1, s=1, p=None, g=1, act=True
  [[-1, 1, Conv, [32, 3, 2, None, 1, nn.LeakyReLU(0.1)]],  # 0-P1/2

   [-1, 1, Conv, [64, 3, 2, None, 1, nn.LeakyReLU(0.1)]],  # 1-P2/4

   [-1, 1, Conv, [32, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [32, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [32, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [32, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 7

   [-1, 1, MP, []],  # 8-P3/8
   [-1, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [64, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [64, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 14

   [-1, 1, MP, []],  # 15-P4/16
   [-1, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [128, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [128, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 21

   [-1, 1, MP, []],  # 22-P5/32
   [-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [256, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [256, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 28
  ]

# yolov7-tiny head
head:
  [[-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, SP, [5]],
   [-2, 1, SP, [9]],
   [-3, 1, SP, [13]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -7], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 37

   [-1, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [21, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [64, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [64, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 47

   [-1, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [14, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [32, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [32, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [32, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [32, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 57

   [-1, 1, Conv, [128, 3, 2, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, 47], 1, Concat, [1]],

   [-1, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [64, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [64, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 65

   [-1, 1, Conv, [256, 3, 2, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, 37], 1, Concat, [1]],

   [-1, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [128, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [128, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 73

   [57, 1, Conv, [128, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [65, 1, Conv, [256, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [73, 1, Conv, [512, 3, 1, None, 1, nn.LeakyReLU(0.1)]],

   [[74,75,76], 1, IDetect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
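The tiny config spells out the full Conv argument list [c2, k, s, p, g, act] so that every layer can swap the default activation for nn.LeakyReLU(0.1). A stand-in Conv showing how such an args row maps onto a module (a sketch of the common YOLO Conv pattern, not this repo's implementation):

import torch.nn as nn

class Conv(nn.Module):
    # Matches the [c2, k, s, p, g, act] argument order used by the tiny config
    # above (illustrative; the default SiLU activation is an assumption).
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):
        super().__init__()
        p = k // 2 if p is None else p  # None -> 'same'-style padding
        self.conv = nn.Conv2d(c1, c2, k, s, p, groups=g, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        # act=True -> default activation; an nn.Module instance is used as-is
        if act is True:
            self.act = nn.SiLU()
        elif isinstance(act, nn.Module):
            self.act = act
        else:
            self.act = nn.Identity()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

# Row 0 of the tiny backbone then reads roughly as:
layer0 = Conv(3, 32, 3, 2, None, 1, nn.LeakyReLU(0.1))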
163
yolov7-tracker-example/cfg/training/yolov7-w6.yaml
Normal file
@ -0,0 +1,163 @@
# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [ 19,27, 44,40, 38,94 ]  # P3/8
  - [ 96,68, 86,152, 180,137 ]  # P4/16
  - [ 140,301, 303,264, 238,542 ]  # P5/32
  - [ 436,615, 739,380, 925,792 ]  # P6/64

# yolov7 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, ReOrg, []],  # 0
   [-1, 1, Conv, [64, 3, 1]],  # 1-P1/2

   [-1, 1, Conv, [128, 3, 2]],  # 2-P2/4
   [-1, 1, Conv, [64, 1, 1]],
   [-2, 1, Conv, [64, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1]],  # 10

   [-1, 1, Conv, [256, 3, 2]],  # 11-P3/8
   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],  # 19

   [-1, 1, Conv, [512, 3, 2]],  # 20-P4/16
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1]],  # 28

   [-1, 1, Conv, [768, 3, 2]],  # 29-P5/32
   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [-1, 1, Conv, [384, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [768, 1, 1]],  # 37

   [-1, 1, Conv, [1024, 3, 2]],  # 38-P6/64
   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [1024, 1, 1]],  # 46
  ]

# yolov7 head
head:
  [[-1, 1, SPPCSPC, [512]],  # 47

   [-1, 1, Conv, [384, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [37, 1, Conv, [384, 1, 1]],  # route backbone P5
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [384, 1, 1]],  # 59

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [28, 1, Conv, [256, 1, 1]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],  # 71

   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [19, 1, Conv, [128, 1, 1]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1]],  # 83

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 71], 1, Concat, [1]],  # cat

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],  # 93

   [-1, 1, Conv, [384, 3, 2]],
   [[-1, 59], 1, Concat, [1]],  # cat

   [-1, 1, Conv, [384, 1, 1]],
   [-2, 1, Conv, [384, 1, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [-1, 1, Conv, [192, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [384, 1, 1]],  # 103

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 47], 1, Concat, [1]],  # cat

   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1]],  # 113

   [83, 1, Conv, [256, 3, 1]],
   [93, 1, Conv, [512, 3, 1]],
   [103, 1, Conv, [768, 3, 1]],
   [113, 1, Conv, [1024, 3, 1]],

   [83, 1, Conv, [320, 3, 1]],
   [71, 1, Conv, [640, 3, 1]],
   [59, 1, Conv, [960, 3, 1]],
   [47, 1, Conv, [1280, 3, 1]],

   [[114,115,116,117,118,119,120,121], 1, IAuxDetect, [nc, anchors]],  # Detect(P3, P4, P5, P6)
  ]
140
yolov7-tracker-example/cfg/training/yolov7.yaml
Normal file
@ -0,0 +1,140 @@
# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [12,16, 19,36, 40,28]  # P3/8
  - [36,75, 76,55, 72,146]  # P4/16
  - [142,110, 192,243, 459,401]  # P5/32

# yolov7 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [32, 3, 1]],  # 0

   [-1, 1, Conv, [64, 3, 2]],  # 1-P1/2
   [-1, 1, Conv, [64, 3, 1]],

   [-1, 1, Conv, [128, 3, 2]],  # 3-P2/4
   [-1, 1, Conv, [64, 1, 1]],
   [-2, 1, Conv, [64, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],  # 11

   [-1, 1, MP, []],
   [-1, 1, Conv, [128, 1, 1]],
   [-3, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 16-P3/8
   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1]],  # 24

   [-1, 1, MP, []],
   [-1, 1, Conv, [256, 1, 1]],
   [-3, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 29-P4/16
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [1024, 1, 1]],  # 37

   [-1, 1, MP, []],
   [-1, 1, Conv, [512, 1, 1]],
   [-3, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [512, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 42-P5/32
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [1024, 1, 1]],  # 50
  ]

# yolov7 head
head:
  [[-1, 1, SPPCSPC, [512]],  # 51

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [37, 1, Conv, [256, 1, 1]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],  # 63

   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [24, 1, Conv, [128, 1, 1]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1]],  # 75

   [-1, 1, MP, []],
   [-1, 1, Conv, [128, 1, 1]],
   [-3, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 2]],
   [[-1, -3, 63], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],  # 88

   [-1, 1, MP, []],
   [-1, 1, Conv, [256, 1, 1]],
   [-3, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 2]],
   [[-1, -3, 51], 1, Concat, [1]],

   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1]],  # 101

   [75, 1, RepConv, [256, 3, 1]],
   [88, 1, RepConv, [512, 3, 1]],
   [101, 1, RepConv, [1024, 3, 1]],

   [[102,103,104], 1, IDetect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
156
yolov7-tracker-example/cfg/training/yolov7x.yaml
Normal file
@ -0,0 +1,156 @@
# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [12,16, 19,36, 40,28]  # P3/8
  - [36,75, 76,55, 72,146]  # P4/16
  - [142,110, 192,243, 459,401]  # P5/32

# yolov7 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [40, 3, 1]],  # 0

   [-1, 1, Conv, [80, 3, 2]],  # 1-P1/2
   [-1, 1, Conv, [80, 3, 1]],

   [-1, 1, Conv, [160, 3, 2]],  # 3-P2/4
   [-1, 1, Conv, [64, 1, 1]],
   [-2, 1, Conv, [64, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 13

   [-1, 1, MP, []],
   [-1, 1, Conv, [160, 1, 1]],
   [-3, 1, Conv, [160, 1, 1]],
   [-1, 1, Conv, [160, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 18-P3/8
   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [640, 1, 1]],  # 28

   [-1, 1, MP, []],
   [-1, 1, Conv, [320, 1, 1]],
   [-3, 1, Conv, [320, 1, 1]],
   [-1, 1, Conv, [320, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 33-P4/16
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [1280, 1, 1]],  # 43

   [-1, 1, MP, []],
   [-1, 1, Conv, [640, 1, 1]],
   [-3, 1, Conv, [640, 1, 1]],
   [-1, 1, Conv, [640, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 48-P5/32
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [1280, 1, 1]],  # 58
  ]

# yolov7 head
head:
  [[-1, 1, SPPCSPC, [640]],  # 59

   [-1, 1, Conv, [320, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [43, 1, Conv, [320, 1, 1]],  # route backbone P4
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 73

   [-1, 1, Conv, [160, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [28, 1, Conv, [160, 1, 1]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [160, 1, 1]],  # 87

   [-1, 1, MP, []],
   [-1, 1, Conv, [160, 1, 1]],
   [-3, 1, Conv, [160, 1, 1]],
   [-1, 1, Conv, [160, 3, 2]],
   [[-1, -3, 73], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [320, 1, 1]],  # 102

   [-1, 1, MP, []],
   [-1, 1, Conv, [320, 1, 1]],
   [-3, 1, Conv, [320, 1, 1]],
   [-1, 1, Conv, [320, 3, 2]],
   [[-1, -3, 59], 1, Concat, [1]],

   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [-1, 1, Conv, [512, 3, 1]],
   [[-1, -3, -5, -7, -8], 1, Concat, [1]],
   [-1, 1, Conv, [640, 1, 1]],  # 117

   [87, 1, Conv, [320, 3, 1]],
   [102, 1, Conv, [640, 3, 1]],
   [117, 1, Conv, [1280, 3, 1]],

   [[118,119,120], 1, IDetect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
23
yolov7-tracker-example/data/coco.yaml
Normal file
@ -0,0 +1,23 @@
# COCO 2017 dataset http://cocodataset.org

# download command/URL (optional)
download: bash ./scripts/get_coco.sh

# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: ./coco/train2017.txt  # 118287 images
val: ./coco/val2017.txt  # 5000 images
test: ./coco/test-dev2017.txt  # 20288 of 40670 images, submit to https://competitions.codalab.org/competitions/20794

# number of classes
nc: 80

# class names
names: [ 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
         'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
         'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
         'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
         'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
         'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
         'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
         'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
         'hair drier', 'toothbrush' ]
29
yolov7-tracker-example/data/hyp.scratch.custom.yaml
Normal file
@@ -0,0 +1,29 @@
lr0: 0.01  # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.1  # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937  # SGD momentum/Adam beta1
weight_decay: 0.0005  # optimizer weight decay 5e-4
warmup_epochs: 3.0  # warmup epochs (fractions ok)
warmup_momentum: 0.8  # warmup initial momentum
warmup_bias_lr: 0.1  # warmup initial bias lr
box: 0.05  # box loss gain
cls: 0.3  # cls loss gain
cls_pw: 1.0  # cls BCELoss positive_weight
obj: 0.7  # obj loss gain (scale with pixels)
obj_pw: 1.0  # obj BCELoss positive_weight
iou_t: 0.20  # IoU training threshold
anchor_t: 4.0  # anchor-multiple threshold
fl_gamma: 0.0  # focal loss gamma (efficientDet default gamma=1.5)
hsv_h: 0.015  # image HSV-Hue augmentation (fraction)
hsv_s: 0.7  # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4  # image HSV-Value augmentation (fraction)
degrees: 0.0  # image rotation (+/- deg)
translate: 0.2  # image translation (+/- fraction)
scale: 0.5  # image scale (+/- gain)
shear: 0.0  # image shear (+/- deg)
perspective: 0.0  # image perspective (+/- fraction), range 0-0.001
flipud: 0.0  # image flip up-down (probability)
fliplr: 0.5  # image flip left-right (probability)
mosaic: 1.0  # image mosaic (probability)
mixup: 0.0  # image mixup (probability)
copy_paste: 0.0  # image copy paste (probability)
paste_in: 0.0  # image copy paste (probability)
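These hyperparameters are consumed as a flat dict by the training entry point. A minimal sketch of how such a file is typically loaded (the relative path assumes the repo root as working directory):

# Minimal sketch, assuming train.py-style consumption of the hyp file: the YAML
# becomes a flat dict whose keys drive the optimizer and augmentation pipeline.
import yaml

with open('data/hyp.scratch.custom.yaml') as f:  # assumed working directory
    hyp = yaml.safe_load(f)

print(hyp['lr0'], hyp['lrf'])       # 0.01, 0.1 -> OneCycleLR decays lr0 toward lr0 * lrf
print(hyp['mosaic'], hyp['mixup'])  # 1.0, 0.0 -> mosaic always on, mixup off in this profile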
29
yolov7-tracker-example/data/hyp.scratch.p5.yaml
Normal file
@@ -0,0 +1,29 @@
lr0: 0.01  # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.1  # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937  # SGD momentum/Adam beta1
weight_decay: 0.0005  # optimizer weight decay 5e-4
warmup_epochs: 3.0  # warmup epochs (fractions ok)
warmup_momentum: 0.8  # warmup initial momentum
warmup_bias_lr: 0.1  # warmup initial bias lr
box: 0.05  # box loss gain
cls: 0.3  # cls loss gain
cls_pw: 1.0  # cls BCELoss positive_weight
obj: 0.7  # obj loss gain (scale with pixels)
obj_pw: 1.0  # obj BCELoss positive_weight
iou_t: 0.20  # IoU training threshold
anchor_t: 4.0  # anchor-multiple threshold
fl_gamma: 0.0  # focal loss gamma (efficientDet default gamma=1.5)
hsv_h: 0.015  # image HSV-Hue augmentation (fraction)
hsv_s: 0.7  # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4  # image HSV-Value augmentation (fraction)
degrees: 0.0  # image rotation (+/- deg)
translate: 0.2  # image translation (+/- fraction)
scale: 0.9  # image scale (+/- gain)
shear: 0.0  # image shear (+/- deg)
perspective: 0.0  # image perspective (+/- fraction), range 0-0.001
flipud: 0.0  # image flip up-down (probability)
fliplr: 0.5  # image flip left-right (probability)
mosaic: 1.0  # image mosaic (probability)
mixup: 0.15  # image mixup (probability)
copy_paste: 0.0  # image copy paste (probability)
paste_in: 0.15  # image copy paste (probability)
29
yolov7-tracker-example/data/hyp.scratch.p6.yaml
Normal file
@@ -0,0 +1,29 @@
lr0: 0.01  # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.2  # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937  # SGD momentum/Adam beta1
weight_decay: 0.0005  # optimizer weight decay 5e-4
warmup_epochs: 3.0  # warmup epochs (fractions ok)
warmup_momentum: 0.8  # warmup initial momentum
warmup_bias_lr: 0.1  # warmup initial bias lr
box: 0.05  # box loss gain
cls: 0.3  # cls loss gain
cls_pw: 1.0  # cls BCELoss positive_weight
obj: 0.7  # obj loss gain (scale with pixels)
obj_pw: 1.0  # obj BCELoss positive_weight
iou_t: 0.20  # IoU training threshold
anchor_t: 4.0  # anchor-multiple threshold
fl_gamma: 0.0  # focal loss gamma (efficientDet default gamma=1.5)
hsv_h: 0.015  # image HSV-Hue augmentation (fraction)
hsv_s: 0.7  # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4  # image HSV-Value augmentation (fraction)
degrees: 0.0  # image rotation (+/- deg)
translate: 0.2  # image translation (+/- fraction)
scale: 0.9  # image scale (+/- gain)
shear: 0.0  # image shear (+/- deg)
perspective: 0.0  # image perspective (+/- fraction), range 0-0.001
flipud: 0.0  # image flip up-down (probability)
fliplr: 0.5  # image flip left-right (probability)
mosaic: 1.0  # image mosaic (probability)
mixup: 0.15  # image mixup (probability)
copy_paste: 0.0  # image copy paste (probability)
paste_in: 0.15  # image copy paste (probability)
29
yolov7-tracker-example/data/hyp.scratch.tiny.yaml
Normal file
@@ -0,0 +1,29 @@
lr0: 0.01  # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.01  # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937  # SGD momentum/Adam beta1
weight_decay: 0.0005  # optimizer weight decay 5e-4
warmup_epochs: 3.0  # warmup epochs (fractions ok)
warmup_momentum: 0.8  # warmup initial momentum
warmup_bias_lr: 0.1  # warmup initial bias lr
box: 0.05  # box loss gain
cls: 0.5  # cls loss gain
cls_pw: 1.0  # cls BCELoss positive_weight
obj: 1.0  # obj loss gain (scale with pixels)
obj_pw: 1.0  # obj BCELoss positive_weight
iou_t: 0.20  # IoU training threshold
anchor_t: 4.0  # anchor-multiple threshold
fl_gamma: 0.0  # focal loss gamma (efficientDet default gamma=1.5)
hsv_h: 0.015  # image HSV-Hue augmentation (fraction)
hsv_s: 0.7  # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4  # image HSV-Value augmentation (fraction)
degrees: 0.0  # image rotation (+/- deg)
translate: 0.1  # image translation (+/- fraction)
scale: 0.5  # image scale (+/- gain)
shear: 0.0  # image shear (+/- deg)
perspective: 0.0  # image perspective (+/- fraction), range 0-0.001
flipud: 0.0  # image flip up-down (probability)
fliplr: 0.5  # image flip left-right (probability)
mosaic: 1.0  # image mosaic (probability)
mixup: 0.05  # image mixup (probability)
copy_paste: 0.0  # image copy paste (probability)
paste_in: 0.05  # image copy paste (probability)
7
yolov7-tracker-example/data/mot17.yaml
Normal file
@@ -0,0 +1,7 @@
train: ./mot17/train.txt
val: ./mot17/val.txt
test: ./mot17/val.txt

nc: 1

names: ['pedestrian']
7
yolov7-tracker-example/data/uavdt.yaml
Normal file
@@ -0,0 +1,7 @@
train: ./uavdt/train.txt
val: ./uavdt/test.txt
test: ./uavdt/test.txt

nc: 1

names: ['car']
8
yolov7-tracker-example/data/visdrone_all.yaml
Normal file
@@ -0,0 +1,8 @@

train: ./visdrone/train.txt
val: ./visdrone/val.txt
test: ./visdrone/test.txt

nc: 10

names: ['pedestrian', 'people', 'bicycle', 'car', 'van', 'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor']
8
yolov7-tracker-example/data/visdrone_half_car.yaml
Normal file
@@ -0,0 +1,8 @@

train: ./visdrone/train.txt
val: ./visdrone/val.txt
test: ./visdrone/test.txt

nc: 4

names: ['car', 'van', 'truck', 'bus']
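A quick consistency check for dataset yamls like the ones above: nc must equal len(names), and the split lists should exist before training starts. A minimal sketch, assuming the repo root as working directory:

# Minimal sketch validating a dataset yaml; the file path is an assumption.
import os
import yaml

def check_data_yaml(path):
    with open(path) as f:
        data = yaml.safe_load(f)
    assert data['nc'] == len(data['names']), f"nc={data['nc']} but {len(data['names'])} names"
    for split in ('train', 'val', 'test'):
        if split in data and not os.path.exists(data[split]):
            print(f"warning: {split} list {data[split]} not found")
    return data

check_data_yaml('data/mot17.yaml')  # assumed relative path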
184
yolov7-tracker-example/detect.py
Normal file
@@ -0,0 +1,184 @@
import argparse
import time
from pathlib import Path

import cv2
import torch
import torch.backends.cudnn as cudnn
from numpy import random

from models.experimental import attempt_load
from utils.datasets import LoadStreams, LoadImages
from utils.general import check_img_size, check_requirements, check_imshow, non_max_suppression, apply_classifier, \
    scale_coords, xyxy2xywh, strip_optimizer, set_logging, increment_path
from utils.plots import plot_one_box
from utils.torch_utils import select_device, load_classifier, time_synchronized, TracedModel


def detect(save_img=False):
    source, weights, view_img, save_txt, imgsz, trace = opt.source, opt.weights, opt.view_img, opt.save_txt, opt.img_size, not opt.no_trace
    save_img = not opt.nosave and not source.endswith('.txt')  # save inference images
    webcam = source.isnumeric() or source.endswith('.txt') or source.lower().startswith(
        ('rtsp://', 'rtmp://', 'http://', 'https://'))

    # Directories
    save_dir = Path(increment_path(Path(opt.project) / opt.name, exist_ok=opt.exist_ok))  # increment run
    (save_dir / 'labels' if save_txt else save_dir).mkdir(parents=True, exist_ok=True)  # make dir

    # Initialize
    set_logging()
    device = select_device(opt.device)
    half = device.type != 'cpu'  # half precision only supported on CUDA

    # Load model
    model = attempt_load(weights, map_location=device)  # load FP32 model
    stride = int(model.stride.max())  # model stride
    imgsz = check_img_size(imgsz, s=stride)  # check img_size

    if trace:
        model = TracedModel(model, device, opt.img_size)

    if half:
        model.half()  # to FP16

    # Second-stage classifier
    classify = False
    if classify:
        modelc = load_classifier(name='resnet101', n=2)  # initialize
        modelc.load_state_dict(torch.load('weights/resnet101.pt', map_location=device)['model'])
        modelc.to(device).eval()  # load_state_dict() returns key info, not the module, so move/eval separately

    # Set Dataloader
    vid_path, vid_writer = None, None
    if webcam:
        view_img = check_imshow()
        cudnn.benchmark = True  # set True to speed up constant image size inference
        dataset = LoadStreams(source, img_size=imgsz, stride=stride)
    else:
        dataset = LoadImages(source, img_size=imgsz, stride=stride)

    # Get names and colors
    names = model.module.names if hasattr(model, 'module') else model.names
    colors = [[random.randint(0, 255) for _ in range(3)] for _ in names]

    # Run inference
    if device.type != 'cpu':
        model(torch.zeros(1, 3, imgsz, imgsz).to(device).type_as(next(model.parameters())))  # run once
    t0 = time.time()
    for path, img, im0s, vid_cap in dataset:
        img = torch.from_numpy(img).to(device)
        img = img.half() if half else img.float()  # uint8 to fp16/32
        img /= 255.0  # 0 - 255 to 0.0 - 1.0
        if img.ndimension() == 3:
            img = img.unsqueeze(0)

        # Inference
        t1 = time_synchronized()
        pred = model(img, augment=opt.augment)[0]

        # Apply NMS
        pred = non_max_suppression(pred, opt.conf_thres, opt.iou_thres, classes=opt.classes, agnostic=opt.agnostic_nms)
        t2 = time_synchronized()

        # Apply Classifier
        if classify:
            pred = apply_classifier(pred, modelc, img, im0s)

        # Process detections
        for i, det in enumerate(pred):  # detections per image
            if webcam:  # batch_size >= 1
                p, s, im0, frame = path[i], '%g: ' % i, im0s[i].copy(), dataset.count
            else:
                p, s, im0, frame = path, '', im0s, getattr(dataset, 'frame', 0)

            p = Path(p)  # to Path
            save_path = str(save_dir / p.name)  # img.jpg
            txt_path = str(save_dir / 'labels' / p.stem) + ('' if dataset.mode == 'image' else f'_{frame}')  # img.txt
            s += '%gx%g ' % img.shape[2:]  # print string
            gn = torch.tensor(im0.shape)[[1, 0, 1, 0]]  # normalization gain whwh
            if len(det):
                # Rescale boxes from img_size to im0 size
                det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0.shape).round()

                # Print results
                for c in det[:, -1].unique():
                    n = (det[:, -1] == c).sum()  # detections per class
                    s += f"{n} {names[int(c)]}{'s' * (n > 1)}, "  # add to string

                # Write results
                for *xyxy, conf, cls in reversed(det):
                    if save_txt:  # Write to file
                        xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()  # normalized xywh
                        line = (cls, *xywh, conf) if opt.save_conf else (cls, *xywh)  # label format
                        with open(txt_path + '.txt', 'a') as f:
                            f.write(('%g ' * len(line)).rstrip() % line + '\n')

                    if save_img or view_img:  # Add bbox to image
                        label = f'{names[int(cls)]} {conf:.2f}'
                        plot_one_box(xyxy, im0, label=label, color=colors[int(cls)], line_thickness=3)

            # Print time (inference + NMS)
            #print(f'{s}Done. ({t2 - t1:.3f}s)')

            # Stream results
            if view_img:
                cv2.imshow(str(p), im0)
                cv2.waitKey(1)  # 1 millisecond

            # Save results (image with detections)
            if save_img:
                if dataset.mode == 'image':
                    cv2.imwrite(save_path, im0)
                    print(f" The image with the result is saved in: {save_path}")
                else:  # 'video' or 'stream'
                    if vid_path != save_path:  # new video
                        vid_path = save_path
                        if isinstance(vid_writer, cv2.VideoWriter):
                            vid_writer.release()  # release previous video writer
                        if vid_cap:  # video
                            fps = vid_cap.get(cv2.CAP_PROP_FPS)
                            w = int(vid_cap.get(cv2.CAP_PROP_FRAME_WIDTH))
                            h = int(vid_cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
                        else:  # stream
                            fps, w, h = 30, im0.shape[1], im0.shape[0]
                            save_path += '.mp4'
                        vid_writer = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))
                    vid_writer.write(im0)

    if save_txt or save_img:
        s = f"\n{len(list(save_dir.glob('labels/*.txt')))} labels saved to {save_dir / 'labels'}" if save_txt else ''
        #print(f"Results saved to {save_dir}{s}")

    print(f'Done. ({time.time() - t0:.3f}s)')


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', nargs='+', type=str, default='yolov7.pt', help='model.pt path(s)')
    parser.add_argument('--source', type=str, default='inference/images', help='source')  # file/folder, 0 for webcam
    parser.add_argument('--img-size', type=int, default=640, help='inference size (pixels)')
    parser.add_argument('--conf-thres', type=float, default=0.25, help='object confidence threshold')
    parser.add_argument('--iou-thres', type=float, default=0.45, help='IOU threshold for NMS')
    parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--view-img', action='store_true', help='display results')
    parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
    parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
    parser.add_argument('--nosave', action='store_true', help='do not save images/videos')
    parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --class 0, or --class 0 2 3')
    parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS')
    parser.add_argument('--augment', action='store_true', help='augmented inference')
    parser.add_argument('--update', action='store_true', help='update all models')
    parser.add_argument('--project', default='runs/detect', help='save results to project/name')
    parser.add_argument('--name', default='exp', help='save results to project/name')
    parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
    parser.add_argument('--no-trace', action='store_true', help="don't trace model")
    opt = parser.parse_args()
    print(opt)
    #check_requirements(exclude=('pycocotools', 'thop'))

    with torch.no_grad():
        if opt.update:  # update all models (to fix SourceChangeWarning)
            for opt.weights in ['yolov7.pt']:
                detect()
                strip_optimizer(opt.weights)
        else:
            detect()
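A usage sketch for the script above, mirroring its argparse defaults; the checkpoint name and source folder are assumptions:

# Sketch, assuming a yolov7.pt checkpoint next to detect.py; flags match the
# argparse definitions above.
import subprocess

subprocess.run([
    'python', 'detect.py',
    '--weights', 'yolov7.pt',        # assumed checkpoint
    '--source', 'inference/images',  # folder of images (or '0' for webcam)
    '--img-size', '640',
    '--conf-thres', '0.25',
], check=True)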
BIN
yolov7-tracker-example/figure/demo.gif
Normal file
Binary file not shown. (Size: 12 MiB)
BIN
yolov7-tracker-example/figure/horses_prediction.jpg
Normal file
Binary file not shown. (Size: 151 KiB)
BIN
yolov7-tracker-example/figure/mask.png
Normal file
Binary file not shown. (Size: 102 KiB)
BIN
yolov7-tracker-example/figure/performance.png
Normal file
Binary file not shown. (Size: 164 KiB)
BIN
yolov7-tracker-example/figure/pose.png
Normal file
Binary file not shown. (Size: 347 KiB)
97
yolov7-tracker-example/hubconf.py
Normal file
@@ -0,0 +1,97 @@
"""PyTorch Hub models

Usage:
    import torch
    model = torch.hub.load('repo', 'model')
"""

from pathlib import Path

import torch

from models.yolo import Model
from utils.general import check_requirements, set_logging
from utils.google_utils import attempt_download
from utils.torch_utils import select_device

dependencies = ['torch', 'yaml']
check_requirements(Path(__file__).parent / 'requirements.txt', exclude=('pycocotools', 'thop'))
set_logging()


def create(name, pretrained, channels, classes, autoshape):
    """Creates a specified model

    Arguments:
        name (str): name of model, i.e. 'yolov7'
        pretrained (bool): load pretrained weights into the model
        channels (int): number of input channels
        classes (int): number of model classes

    Returns:
        pytorch model
    """
    try:
        cfg = list((Path(__file__).parent / 'cfg').rglob(f'{name}.yaml'))[0]  # model.yaml path
        model = Model(cfg, channels, classes)
        if pretrained:
            fname = f'{name}.pt'  # checkpoint filename
            attempt_download(fname)  # download if not found locally
            ckpt = torch.load(fname, map_location=torch.device('cpu'))  # load
            msd = model.state_dict()  # model state_dict
            csd = ckpt['model'].float().state_dict()  # checkpoint state_dict as FP32
            csd = {k: v for k, v in csd.items() if msd[k].shape == v.shape}  # filter
            model.load_state_dict(csd, strict=False)  # load
            if len(ckpt['model'].names) == classes:
                model.names = ckpt['model'].names  # set class names attribute
            if autoshape:
                model = model.autoshape()  # for file/URI/PIL/cv2/np inputs and NMS
        device = select_device('0' if torch.cuda.is_available() else 'cpu')  # default to GPU if available
        return model.to(device)

    except Exception as e:
        s = 'Cache may be out of date, try force_reload=True.'
        raise Exception(s) from e


def custom(path_or_model='path/to/model.pt', autoshape=True):
    """custom model

    Arguments (3 options):
        path_or_model (str): 'path/to/model.pt'
        path_or_model (dict): torch.load('path/to/model.pt')
        path_or_model (nn.Module): torch.load('path/to/model.pt')['model']

    Returns:
        pytorch model
    """
    model = torch.load(path_or_model) if isinstance(path_or_model, str) else path_or_model  # load checkpoint
    if isinstance(model, dict):
        model = model['ema' if model.get('ema') else 'model']  # load model

    hub_model = Model(model.yaml).to(next(model.parameters()).device)  # create
    hub_model.load_state_dict(model.float().state_dict())  # load state_dict
    hub_model.names = model.names  # class names
    if autoshape:
        hub_model = hub_model.autoshape()  # for file/URI/PIL/cv2/np inputs and NMS
    device = select_device('0' if torch.cuda.is_available() else 'cpu')  # default to GPU if available
    return hub_model.to(device)


def yolov7(pretrained=True, channels=3, classes=80, autoshape=True):
    return create('yolov7', pretrained, channels, classes, autoshape)


if __name__ == '__main__':
    model = custom(path_or_model='yolov7.pt')  # custom example
    # model = create(name='yolov7', pretrained=True, channels=3, classes=80, autoshape=True)  # pretrained example

    # Verify inference
    import numpy as np
    from PIL import Image

    imgs = [np.zeros((640, 480, 3))]

    results = model(imgs)  # batched inference
    results.print()
    results.save()
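The entry points above can also be reached through torch.hub from a local clone; a sketch, assuming the clone directory name and a yolov7.pt checkpoint:

# Sketch, assuming a local clone at ./yolov7-tracker-example; torch.hub
# dispatches to the custom() entry point defined in hubconf.py above.
import torch

model = torch.hub.load('yolov7-tracker-example', 'custom',
                       path_or_model='yolov7.pt', source='local')  # assumed paths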
BIN
yolov7-tracker-example/inference/images/horses.jpg
Normal file
Binary file not shown. (Size: 130 KiB)
1
yolov7-tracker-example/models/__init__.py
Normal file
@@ -0,0 +1 @@
# init
2019
yolov7-tracker-example/models/common.py
Normal file
File diff suppressed because it is too large
106
yolov7-tracker-example/models/experimental.py
Normal file
@@ -0,0 +1,106 @@
import numpy as np
import torch
import torch.nn as nn

from models.common import Conv, DWConv
from utils.google_utils import attempt_download


class CrossConv(nn.Module):
    # Cross Convolution Downsample
    def __init__(self, c1, c2, k=3, s=1, g=1, e=1.0, shortcut=False):
        # ch_in, ch_out, kernel, stride, groups, expansion, shortcut
        super(CrossConv, self).__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, (1, k), (1, s))
        self.cv2 = Conv(c_, c2, (k, 1), (s, 1), g=g)
        self.add = shortcut and c1 == c2

    def forward(self, x):
        return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))


class Sum(nn.Module):
    # Weighted sum of 2 or more layers https://arxiv.org/abs/1911.09070
    def __init__(self, n, weight=False):  # n: number of inputs
        super(Sum, self).__init__()
        self.weight = weight  # apply weights boolean
        self.iter = range(n - 1)  # iter object
        if weight:
            self.w = nn.Parameter(-torch.arange(1., n) / 2, requires_grad=True)  # layer weights

    def forward(self, x):
        y = x[0]  # no weight
        if self.weight:
            w = torch.sigmoid(self.w) * 2
            for i in self.iter:
                y = y + x[i + 1] * w[i]
        else:
            for i in self.iter:
                y = y + x[i + 1]
        return y


class MixConv2d(nn.Module):
    # Mixed Depthwise Conv https://arxiv.org/abs/1907.09595
    def __init__(self, c1, c2, k=(1, 3), s=1, equal_ch=True):
        super(MixConv2d, self).__init__()
        groups = len(k)
        if equal_ch:  # equal c_ per group
            i = torch.linspace(0, groups - 1E-6, c2).floor()  # c2 indices
            c_ = [(i == g).sum() for g in range(groups)]  # intermediate channels
        else:  # equal weight.numel() per group
            b = [c2] + [0] * groups
            a = np.eye(groups + 1, groups, k=-1)
            a -= np.roll(a, 1, axis=1)
            a *= np.array(k) ** 2
            a[0] = 1
            c_ = np.linalg.lstsq(a, b, rcond=None)[0].round()  # solve for equal weight indices, ax = b

        self.m = nn.ModuleList([nn.Conv2d(c1, int(c_[g]), k[g], s, k[g] // 2, bias=False) for g in range(groups)])
        self.bn = nn.BatchNorm2d(c2)
        self.act = nn.LeakyReLU(0.1, inplace=True)

    def forward(self, x):
        return x + self.act(self.bn(torch.cat([m(x) for m in self.m], 1)))


class Ensemble(nn.ModuleList):
    # Ensemble of models
    def __init__(self):
        super(Ensemble, self).__init__()

    def forward(self, x, augment=False):
        y = []
        for module in self:
            y.append(module(x, augment)[0])
        # y = torch.stack(y).max(0)[0]  # max ensemble
        # y = torch.stack(y).mean(0)  # mean ensemble
        y = torch.cat(y, 1)  # nms ensemble
        return y, None  # inference, train output


def attempt_load(weights, map_location=None):
    # Loads an ensemble of models weights=[a,b,c] or a single model weights=[a] or weights=a
    model = Ensemble()
    for w in weights if isinstance(weights, list) else [weights]:
        # attempt_download(w)
        ckpt = torch.load(w, map_location=map_location)  # load
        model.append(ckpt['ema' if ckpt.get('ema') else 'model'].float().fuse().eval())  # FP32 model

    # Compatibility updates
    for m in model.modules():
        if type(m) in [nn.Hardswish, nn.LeakyReLU, nn.ReLU, nn.ReLU6, nn.SiLU]:
            m.inplace = True  # pytorch 1.7.0 compatibility
        elif type(m) is nn.Upsample:
            m.recompute_scale_factor = None  # torch 1.11.0 compatibility
        elif type(m) is Conv:
            m._non_persistent_buffers_set = set()  # pytorch 1.6.0 compatibility

    if len(model) == 1:
        return model[-1]  # return model
    else:
        print('Ensemble created with %s\n' % weights)
        for k in ['names', 'stride']:
            setattr(model, k, getattr(model[-1], k))
        return model  # return ensemble
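A sketch of the loader above on a single checkpoint (the file name is assumed): with one weight file, attempt_load returns the bare fused model rather than an Ensemble:

# Sketch, assuming a yolov7.pt checkpoint; the model is returned fused and in
# eval mode, so forward() yields (predictions, raw_outputs).
import torch
from models.experimental import attempt_load

model = attempt_load('yolov7.pt', map_location='cpu')  # assumed checkpoint
img = torch.zeros(1, 3, 640, 640)
pred = model(img)[0]  # shape (1, num_predictions, 5 + nc)
print(pred.shape)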
98
yolov7-tracker-example/models/export.py
Normal file
@@ -0,0 +1,98 @@
import argparse
import sys
import time

sys.path.append('./')  # to run '$ python *.py' files in subdirectories

import torch
import torch.nn as nn

import models
from models.experimental import attempt_load
from utils.activations import Hardswish, SiLU
from utils.general import set_logging, check_img_size
from utils.torch_utils import select_device

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', type=str, default='./yolor-csp-c.pt', help='weights path')
    parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640], help='image size')  # height, width
    parser.add_argument('--batch-size', type=int, default=1, help='batch size')
    parser.add_argument('--dynamic', action='store_true', help='dynamic ONNX axes')
    parser.add_argument('--grid', action='store_true', help='export Detect() layer grid')
    parser.add_argument('--device', default='cpu', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    opt = parser.parse_args()
    opt.img_size *= 2 if len(opt.img_size) == 1 else 1  # expand
    print(opt)
    set_logging()
    t = time.time()

    # Load PyTorch model
    device = select_device(opt.device)
    model = attempt_load(opt.weights, map_location=device)  # load FP32 model
    labels = model.names

    # Checks
    gs = int(max(model.stride))  # grid size (max stride)
    opt.img_size = [check_img_size(x, gs) for x in opt.img_size]  # verify img_size are gs-multiples

    # Input
    img = torch.zeros(opt.batch_size, 3, *opt.img_size).to(device)  # image size(1,3,320,192) iDetection

    # Update model
    for k, m in model.named_modules():
        m._non_persistent_buffers_set = set()  # pytorch 1.6.0 compatibility
        if isinstance(m, models.common.Conv):  # assign export-friendly activations
            if isinstance(m.act, nn.Hardswish):
                m.act = Hardswish()
            elif isinstance(m.act, nn.SiLU):
                m.act = SiLU()
        # elif isinstance(m, models.yolo.Detect):
        #     m.forward = m.forward_export  # assign forward (optional)
    model.model[-1].export = not opt.grid  # set Detect() layer grid export
    y = model(img)  # dry run

    # TorchScript export
    try:
        print('\nStarting TorchScript export with torch %s...' % torch.__version__)
        f = opt.weights.replace('.pt', '.torchscript.pt')  # filename
        ts = torch.jit.trace(model, img, strict=False)
        ts.save(f)
        print('TorchScript export success, saved as %s' % f)
    except Exception as e:
        print('TorchScript export failure: %s' % e)

    # ONNX export
    try:
        import onnx

        print('\nStarting ONNX export with onnx %s...' % onnx.__version__)
        f = opt.weights.replace('.pt', '.onnx')  # filename
        torch.onnx.export(model, img, f, verbose=False, opset_version=12, input_names=['images'],
                          output_names=['classes', 'boxes'] if y is None else ['output'],
                          dynamic_axes={'images': {0: 'batch', 2: 'height', 3: 'width'},  # size(1,3,640,640)
                                        'output': {0: 'batch', 2: 'y', 3: 'x'}} if opt.dynamic else None)

        # Checks
        onnx_model = onnx.load(f)  # load onnx model
        onnx.checker.check_model(onnx_model)  # check onnx model
        # print(onnx.helper.printable_graph(onnx_model.graph))  # print a human readable model
        print('ONNX export success, saved as %s' % f)
    except Exception as e:
        print('ONNX export failure: %s' % e)

    # CoreML export
    try:
        import coremltools as ct

        print('\nStarting CoreML export with coremltools %s...' % ct.__version__)
        # convert model from torchscript and apply pixel scaling as per detect.py
        # (note: 'ts' comes from the TorchScript step above, so that export must have succeeded)
        model = ct.convert(ts, inputs=[ct.ImageType(name='image', shape=img.shape, scale=1 / 255.0, bias=[0, 0, 0])])
        f = opt.weights.replace('.pt', '.mlmodel')  # filename
        model.save(f)
        print('CoreML export success, saved as %s' % f)
    except Exception as e:
        print('CoreML export failure: %s' % e)

    # Finish
    print('\nExport complete (%.2fs). Visualize with https://github.com/lutzroeder/netron.' % (time.time() - t))
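A sketch for verifying the exported ONNX file; onnxruntime is an assumption here (it is not in requirements.txt below), and the file name assumes a yolov7.pt input:

# Sketch, assuming 'pip install onnxruntime' and a yolov7.onnx produced by the
# export script; the 'images' input name matches torch.onnx.export above.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession('yolov7.onnx', providers=['CPUExecutionProvider'])
x = np.zeros((1, 3, 640, 640), dtype=np.float32)
out = sess.run(None, {'images': x})
print([o.shape for o in out])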
550
yolov7-tracker-example/models/yolo.py
Normal file
@@ -0,0 +1,550 @@
import argparse
import logging
import sys
from copy import deepcopy

sys.path.append('./')  # to run '$ python *.py' files in subdirectories
logger = logging.getLogger(__name__)

from models.common import *
from models.experimental import *
from utils.autoanchor import check_anchor_order
from utils.general import make_divisible, check_file, set_logging
from utils.torch_utils import time_synchronized, fuse_conv_and_bn, model_info, scale_img, initialize_weights, \
    select_device, copy_attr
from utils.loss import SigmoidBin

try:
    import thop  # for FLOPS computation
except ImportError:
    thop = None


class Detect(nn.Module):
    stride = None  # strides computed during build
    export = False  # onnx export

    def __init__(self, nc=80, anchors=(), ch=()):  # detection layer
        super(Detect, self).__init__()
        self.nc = nc  # number of classes
        self.no = nc + 5  # number of outputs per anchor
        self.nl = len(anchors)  # number of detection layers
        self.na = len(anchors[0]) // 2  # number of anchors
        self.grid = [torch.zeros(1)] * self.nl  # init grid
        a = torch.tensor(anchors).float().view(self.nl, -1, 2)
        self.register_buffer('anchors', a)  # shape(nl,na,2)
        self.register_buffer('anchor_grid', a.clone().view(self.nl, 1, -1, 1, 1, 2))  # shape(nl,1,na,1,1,2)
        self.m = nn.ModuleList(nn.Conv2d(x, self.no * self.na, 1) for x in ch)  # output conv

    def forward(self, x):
        # x = x.copy()  # for profiling
        z = []  # inference output
        self.training |= self.export
        for i in range(self.nl):
            x[i] = self.m[i](x[i])  # conv
            bs, _, ny, nx = x[i].shape  # x(bs,255,20,20) to x(bs,3,20,20,85)
            x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()

            if not self.training:  # inference
                if self.grid[i].shape[2:4] != x[i].shape[2:4]:
                    self.grid[i] = self._make_grid(nx, ny).to(x[i].device)

                y = x[i].sigmoid()
                y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i]) * self.stride[i]  # xy
                y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh
                z.append(y.view(bs, -1, self.no))

        return x if self.training else (torch.cat(z, 1), x)

    @staticmethod
    def _make_grid(nx=20, ny=20):
        yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)])
        return torch.stack((xv, yv), 2).view((1, 1, ny, nx, 2)).float()


class IDetect(nn.Module):
    stride = None  # strides computed during build
    export = False  # onnx export

    def __init__(self, nc=80, anchors=(), ch=()):  # detection layer
        super(IDetect, self).__init__()
        self.nc = nc  # number of classes
        self.no = nc + 5  # number of outputs per anchor
        self.nl = len(anchors)  # number of detection layers
        self.na = len(anchors[0]) // 2  # number of anchors
        self.grid = [torch.zeros(1)] * self.nl  # init grid
        a = torch.tensor(anchors).float().view(self.nl, -1, 2)
        self.register_buffer('anchors', a)  # shape(nl,na,2)
        self.register_buffer('anchor_grid', a.clone().view(self.nl, 1, -1, 1, 1, 2))  # shape(nl,1,na,1,1,2)
        self.m = nn.ModuleList(nn.Conv2d(x, self.no * self.na, 1) for x in ch)  # output conv

        self.ia = nn.ModuleList(ImplicitA(x) for x in ch)
        self.im = nn.ModuleList(ImplicitM(self.no * self.na) for _ in ch)

    def forward(self, x):
        # x = x.copy()  # for profiling
        z = []  # inference output
        self.training |= self.export
        for i in range(self.nl):
            x[i] = self.m[i](self.ia[i](x[i]))  # conv
            x[i] = self.im[i](x[i])
            bs, _, ny, nx = x[i].shape  # x(bs,255,20,20) to x(bs,3,20,20,85)
            x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()

            if not self.training:  # inference
                if self.grid[i].shape[2:4] != x[i].shape[2:4]:
                    self.grid[i] = self._make_grid(nx, ny).to(x[i].device)

                y = x[i].sigmoid()
                y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i]) * self.stride[i]  # xy
                y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh
                z.append(y.view(bs, -1, self.no))

        return x if self.training else (torch.cat(z, 1), x)

    @staticmethod
    def _make_grid(nx=20, ny=20):
        yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)])
        return torch.stack((xv, yv), 2).view((1, 1, ny, nx, 2)).float()


class IAuxDetect(nn.Module):
    stride = None  # strides computed during build
    export = False  # onnx export

    def __init__(self, nc=80, anchors=(), ch=()):  # detection layer
        super(IAuxDetect, self).__init__()
        self.nc = nc  # number of classes
        self.no = nc + 5  # number of outputs per anchor
        self.nl = len(anchors)  # number of detection layers
        self.na = len(anchors[0]) // 2  # number of anchors
        self.grid = [torch.zeros(1)] * self.nl  # init grid
        a = torch.tensor(anchors).float().view(self.nl, -1, 2)
        self.register_buffer('anchors', a)  # shape(nl,na,2)
        self.register_buffer('anchor_grid', a.clone().view(self.nl, 1, -1, 1, 1, 2))  # shape(nl,1,na,1,1,2)
        self.m = nn.ModuleList(nn.Conv2d(x, self.no * self.na, 1) for x in ch[:self.nl])  # output conv
        self.m2 = nn.ModuleList(nn.Conv2d(x, self.no * self.na, 1) for x in ch[self.nl:])  # output conv

        self.ia = nn.ModuleList(ImplicitA(x) for x in ch[:self.nl])
        self.im = nn.ModuleList(ImplicitM(self.no * self.na) for _ in ch[:self.nl])

    def forward(self, x):
        # x = x.copy()  # for profiling
        z = []  # inference output
        self.training |= self.export
        for i in range(self.nl):
            x[i] = self.m[i](self.ia[i](x[i]))  # conv
            x[i] = self.im[i](x[i])
            bs, _, ny, nx = x[i].shape  # x(bs,255,20,20) to x(bs,3,20,20,85)
            x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()

            x[i+self.nl] = self.m2[i](x[i+self.nl])
            x[i+self.nl] = x[i+self.nl].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()

            if not self.training:  # inference
                if self.grid[i].shape[2:4] != x[i].shape[2:4]:
                    self.grid[i] = self._make_grid(nx, ny).to(x[i].device)

                y = x[i].sigmoid()
                y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i]) * self.stride[i]  # xy
                y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh
                z.append(y.view(bs, -1, self.no))

        return x if self.training else (torch.cat(z, 1), x[:self.nl])

    @staticmethod
    def _make_grid(nx=20, ny=20):
        yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)])
        return torch.stack((xv, yv), 2).view((1, 1, ny, nx, 2)).float()


class IBin(nn.Module):
    stride = None  # strides computed during build
    export = False  # onnx export

    def __init__(self, nc=80, anchors=(), ch=(), bin_count=21):  # detection layer
        super(IBin, self).__init__()
        self.nc = nc  # number of classes
        self.bin_count = bin_count

        self.w_bin_sigmoid = SigmoidBin(bin_count=self.bin_count, min=0.0, max=4.0)
        self.h_bin_sigmoid = SigmoidBin(bin_count=self.bin_count, min=0.0, max=4.0)
        # classes, x,y,obj
        self.no = nc + 3 + \
            self.w_bin_sigmoid.get_length() + self.h_bin_sigmoid.get_length()  # w-bce, h-bce
        # + self.x_bin_sigmoid.get_length() + self.y_bin_sigmoid.get_length()

        self.nl = len(anchors)  # number of detection layers
        self.na = len(anchors[0]) // 2  # number of anchors
        self.grid = [torch.zeros(1)] * self.nl  # init grid
        a = torch.tensor(anchors).float().view(self.nl, -1, 2)
        self.register_buffer('anchors', a)  # shape(nl,na,2)
        self.register_buffer('anchor_grid', a.clone().view(self.nl, 1, -1, 1, 1, 2))  # shape(nl,1,na,1,1,2)
        self.m = nn.ModuleList(nn.Conv2d(x, self.no * self.na, 1) for x in ch)  # output conv

        self.ia = nn.ModuleList(ImplicitA(x) for x in ch)
        self.im = nn.ModuleList(ImplicitM(self.no * self.na) for _ in ch)

    def forward(self, x):

        #self.x_bin_sigmoid.use_fw_regression = True
        #self.y_bin_sigmoid.use_fw_regression = True
        self.w_bin_sigmoid.use_fw_regression = True
        self.h_bin_sigmoid.use_fw_regression = True

        # x = x.copy()  # for profiling
        z = []  # inference output
        self.training |= self.export
        for i in range(self.nl):
            x[i] = self.m[i](self.ia[i](x[i]))  # conv
            x[i] = self.im[i](x[i])
            bs, _, ny, nx = x[i].shape  # x(bs,255,20,20) to x(bs,3,20,20,85)
            x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()

            if not self.training:  # inference
                if self.grid[i].shape[2:4] != x[i].shape[2:4]:
                    self.grid[i] = self._make_grid(nx, ny).to(x[i].device)

                y = x[i].sigmoid()
                y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i]) * self.stride[i]  # xy
                #y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh

                #px = (self.x_bin_sigmoid.forward(y[..., 0:12]) + self.grid[i][..., 0]) * self.stride[i]
                #py = (self.y_bin_sigmoid.forward(y[..., 12:24]) + self.grid[i][..., 1]) * self.stride[i]

                pw = self.w_bin_sigmoid.forward(y[..., 2:24]) * self.anchor_grid[i][..., 0]
                ph = self.h_bin_sigmoid.forward(y[..., 24:46]) * self.anchor_grid[i][..., 1]

                #y[..., 0] = px
                #y[..., 1] = py
                y[..., 2] = pw
                y[..., 3] = ph

                y = torch.cat((y[..., 0:4], y[..., 46:]), dim=-1)

                z.append(y.view(bs, -1, y.shape[-1]))

        return x if self.training else (torch.cat(z, 1), x)

    @staticmethod
    def _make_grid(nx=20, ny=20):
        yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)])
        return torch.stack((xv, yv), 2).view((1, 1, ny, nx, 2)).float()


class Model(nn.Module):
    def __init__(self, cfg='yolor-csp-c.yaml', ch=3, nc=None, anchors=None):  # model, input channels, number of classes
        super(Model, self).__init__()
        self.traced = False
        if isinstance(cfg, dict):
            self.yaml = cfg  # model dict
        else:  # is *.yaml
            import yaml  # for torch hub
            self.yaml_file = Path(cfg).name
            with open(cfg) as f:
                self.yaml = yaml.load(f, Loader=yaml.SafeLoader)  # model dict

        # Define model
        ch = self.yaml['ch'] = self.yaml.get('ch', ch)  # input channels
        if nc and nc != self.yaml['nc']:
            logger.info(f"Overriding model.yaml nc={self.yaml['nc']} with nc={nc}")
            self.yaml['nc'] = nc  # override yaml value
        if anchors:
            logger.info(f'Overriding model.yaml anchors with anchors={anchors}')
            self.yaml['anchors'] = round(anchors)  # override yaml value
        self.model, self.save = parse_model(deepcopy(self.yaml), ch=[ch])  # model, savelist
        self.names = [str(i) for i in range(self.yaml['nc'])]  # default names
        # print([x.shape for x in self.forward(torch.zeros(1, ch, 64, 64))])

        # Build strides, anchors
        m = self.model[-1]  # Detect()
        if isinstance(m, Detect):
            s = 256  # 2x min stride
            m.stride = torch.tensor([s / x.shape[-2] for x in self.forward(torch.zeros(1, ch, s, s))])  # forward
            m.anchors /= m.stride.view(-1, 1, 1)
            check_anchor_order(m)
            self.stride = m.stride
            self._initialize_biases()  # only run once
            # print('Strides: %s' % m.stride.tolist())
        if isinstance(m, IDetect):
            s = 256  # 2x min stride
            m.stride = torch.tensor([s / x.shape[-2] for x in self.forward(torch.zeros(1, ch, s, s))])  # forward
            m.anchors /= m.stride.view(-1, 1, 1)
            check_anchor_order(m)
            self.stride = m.stride
            self._initialize_biases()  # only run once
            # print('Strides: %s' % m.stride.tolist())
        if isinstance(m, IAuxDetect):
            s = 256  # 2x min stride
            m.stride = torch.tensor([s / x.shape[-2] for x in self.forward(torch.zeros(1, ch, s, s))[:4]])  # forward
            #print(m.stride)
            m.anchors /= m.stride.view(-1, 1, 1)
            check_anchor_order(m)
            self.stride = m.stride
            self._initialize_aux_biases()  # only run once
            # print('Strides: %s' % m.stride.tolist())
        if isinstance(m, IBin):
            s = 256  # 2x min stride
            m.stride = torch.tensor([s / x.shape[-2] for x in self.forward(torch.zeros(1, ch, s, s))])  # forward
            m.anchors /= m.stride.view(-1, 1, 1)
            check_anchor_order(m)
            self.stride = m.stride
            self._initialize_biases_bin()  # only run once
            # print('Strides: %s' % m.stride.tolist())

        # Init weights, biases
        initialize_weights(self)
        self.info()
        logger.info('')

    def forward(self, x, augment=False, profile=False):
        if augment:
            img_size = x.shape[-2:]  # height, width
            s = [1, 0.83, 0.67]  # scales
            f = [None, 3, None]  # flips (2-ud, 3-lr)
            y = []  # outputs
            for si, fi in zip(s, f):
                xi = scale_img(x.flip(fi) if fi else x, si, gs=int(self.stride.max()))
                yi = self.forward_once(xi)[0]  # forward
                # cv2.imwrite(f'img_{si}.jpg', 255 * xi[0].cpu().numpy().transpose((1, 2, 0))[:, :, ::-1])  # save
                yi[..., :4] /= si  # de-scale
                if fi == 2:
                    yi[..., 1] = img_size[0] - yi[..., 1]  # de-flip ud
                elif fi == 3:
                    yi[..., 0] = img_size[1] - yi[..., 0]  # de-flip lr
                y.append(yi)
            return torch.cat(y, 1), None  # augmented inference, train
        else:
            return self.forward_once(x, profile)  # single-scale inference, train

    def forward_once(self, x, profile=False):
        y, dt = [], []  # outputs
        for m in self.model:
            if m.f != -1:  # if not from previous layer
                x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f]  # from earlier layers

            if not hasattr(self, 'traced'):
                self.traced = False

            if self.traced:
                if isinstance(m, Detect) or isinstance(m, IDetect) or isinstance(m, IAuxDetect):
                    break

            if profile:
                c = isinstance(m, (Detect, IDetect, IAuxDetect, IBin))
                o = thop.profile(m, inputs=(x.copy() if c else x,), verbose=False)[0] / 1E9 * 2 if thop else 0  # FLOPS
                for _ in range(10):
                    m(x.copy() if c else x)
                t = time_synchronized()
                for _ in range(10):
                    m(x.copy() if c else x)
                dt.append((time_synchronized() - t) * 100)
                print('%10.1f%10.0f%10.1fms %-40s' % (o, m.np, dt[-1], m.type))

            x = m(x)  # run

            y.append(x if m.i in self.save else None)  # save output

        if profile:
            print('%.1fms total' % sum(dt))
        return x

    def _initialize_biases(self, cf=None):  # initialize biases into Detect(), cf is class frequency
        # https://arxiv.org/abs/1708.02002 section 3.3
        # cf = torch.bincount(torch.tensor(np.concatenate(dataset.labels, 0)[:, 0]).long(), minlength=nc) + 1.
        m = self.model[-1]  # Detect() module
        for mi, s in zip(m.m, m.stride):  # from
            b = mi.bias.view(m.na, -1)  # conv.bias(255) to (3,85)
            b.data[:, 4] += math.log(8 / (640 / s) ** 2)  # obj (8 objects per 640 image)
            b.data[:, 5:] += math.log(0.6 / (m.nc - 0.99)) if cf is None else torch.log(cf / cf.sum())  # cls
            mi.bias = torch.nn.Parameter(b.view(-1), requires_grad=True)

    def _initialize_aux_biases(self, cf=None):  # initialize biases into Detect(), cf is class frequency
        # https://arxiv.org/abs/1708.02002 section 3.3
        # cf = torch.bincount(torch.tensor(np.concatenate(dataset.labels, 0)[:, 0]).long(), minlength=nc) + 1.
        m = self.model[-1]  # Detect() module
        for mi, mi2, s in zip(m.m, m.m2, m.stride):  # from
            b = mi.bias.view(m.na, -1)  # conv.bias(255) to (3,85)
            b.data[:, 4] += math.log(8 / (640 / s) ** 2)  # obj (8 objects per 640 image)
            b.data[:, 5:] += math.log(0.6 / (m.nc - 0.99)) if cf is None else torch.log(cf / cf.sum())  # cls
            mi.bias = torch.nn.Parameter(b.view(-1), requires_grad=True)
            b2 = mi2.bias.view(m.na, -1)  # conv.bias(255) to (3,85)
            b2.data[:, 4] += math.log(8 / (640 / s) ** 2)  # obj (8 objects per 640 image)
            b2.data[:, 5:] += math.log(0.6 / (m.nc - 0.99)) if cf is None else torch.log(cf / cf.sum())  # cls
            mi2.bias = torch.nn.Parameter(b2.view(-1), requires_grad=True)

    def _initialize_biases_bin(self, cf=None):  # initialize biases into Detect(), cf is class frequency
        # https://arxiv.org/abs/1708.02002 section 3.3
        # cf = torch.bincount(torch.tensor(np.concatenate(dataset.labels, 0)[:, 0]).long(), minlength=nc) + 1.
        m = self.model[-1]  # Bin() module
        bc = m.bin_count
        for mi, s in zip(m.m, m.stride):  # from
            b = mi.bias.view(m.na, -1)  # conv.bias(255) to (3,85)
            old = b[:, (0, 1, 2, bc + 3)].data
            obj_idx = 2 * bc + 4
            b[:, :obj_idx].data += math.log(0.6 / (bc + 1 - 0.99))
            b[:, obj_idx].data += math.log(8 / (640 / s) ** 2)  # obj (8 objects per 640 image)
            b[:, (obj_idx + 1):].data += math.log(0.6 / (m.nc - 0.99)) if cf is None else torch.log(cf / cf.sum())  # cls
            b[:, (0, 1, 2, bc + 3)].data = old
            mi.bias = torch.nn.Parameter(b.view(-1), requires_grad=True)

    def _print_biases(self):
        m = self.model[-1]  # Detect() module
        for mi in m.m:  # from
            b = mi.bias.detach().view(m.na, -1).T  # conv.bias(255) to (3,85)
            print(('%6g Conv2d.bias:' + '%10.3g' * 6) % (mi.weight.shape[1], *b[:5].mean(1).tolist(), b[5:].mean()))

    # def _print_weights(self):
    #     for m in self.model.modules():
    #         if type(m) is Bottleneck:
    #             print('%10.3g' % (m.w.detach().sigmoid() * 2))  # shortcut weights

    def fuse(self):  # fuse model Conv2d() + BatchNorm2d() layers
        print('Fusing layers... ')
        for m in self.model.modules():
            if isinstance(m, RepConv):
                #print(f" fuse_repvgg_block")
                m.fuse_repvgg_block()
            elif isinstance(m, RepConv_OREPA):
                #print(f" switch_to_deploy")
                m.switch_to_deploy()
            elif type(m) is Conv and hasattr(m, 'bn'):
                m.conv = fuse_conv_and_bn(m.conv, m.bn)  # update conv
                delattr(m, 'bn')  # remove batchnorm
                m.forward = m.fuseforward  # update forward
        self.info()
        return self

    def nms(self, mode=True):  # add or remove NMS module
        present = type(self.model[-1]) is NMS  # last layer is NMS
        if mode and not present:
            print('Adding NMS... ')
            m = NMS()  # module
            m.f = -1  # from
            m.i = self.model[-1].i + 1  # index
            self.model.add_module(name='%s' % m.i, module=m)  # add
            self.eval()
        elif not mode and present:
            print('Removing NMS... ')
            self.model = self.model[:-1]  # remove
        return self

    def autoshape(self):  # add autoShape module
        print('Adding autoShape... ')
        m = autoShape(self)  # wrap model
        copy_attr(m, self, include=('yaml', 'nc', 'hyp', 'names', 'stride'), exclude=())  # copy attributes
        return m

    def info(self, verbose=False, img_size=640):  # print model information
        model_info(self, verbose, img_size)


def parse_model(d, ch):  # model_dict, input_channels(3)
    logger.info('\n%3s%18s%3s%10s %-40s%-30s' % ('', 'from', 'n', 'params', 'module', 'arguments'))
    anchors, nc, gd, gw = d['anchors'], d['nc'], d['depth_multiple'], d['width_multiple']
    na = (len(anchors[0]) // 2) if isinstance(anchors, list) else anchors  # number of anchors
    no = na * (nc + 5)  # number of outputs = anchors * (classes + 5)

    layers, save, c2 = [], [], ch[-1]  # layers, savelist, ch out
    for i, (f, n, m, args) in enumerate(d['backbone'] + d['head']):  # from, number, module, args
        m = eval(m) if isinstance(m, str) else m  # eval strings
        for j, a in enumerate(args):
            try:
                args[j] = eval(a) if isinstance(a, str) else a  # eval strings
            except:
                pass

        n = max(round(n * gd), 1) if n > 1 else n  # depth gain
        if m in [nn.Conv2d, Conv, RobustConv, RobustConv2, DWConv, GhostConv, RepConv, RepConv_OREPA, DownC,
                 SPP, SPPF, SPPCSPC, GhostSPPCSPC, MixConv2d, Focus, Stem, GhostStem, CrossConv,
                 Bottleneck, BottleneckCSPA, BottleneckCSPB, BottleneckCSPC,
                 RepBottleneck, RepBottleneckCSPA, RepBottleneckCSPB, RepBottleneckCSPC,
                 Res, ResCSPA, ResCSPB, ResCSPC,
                 RepRes, RepResCSPA, RepResCSPB, RepResCSPC,
                 ResX, ResXCSPA, ResXCSPB, ResXCSPC,
                 RepResX, RepResXCSPA, RepResXCSPB, RepResXCSPC,
                 Ghost, GhostCSPA, GhostCSPB, GhostCSPC,
                 SwinTransformerBlock, STCSPA, STCSPB, STCSPC,
                 SwinTransformer2Block, ST2CSPA, ST2CSPB, ST2CSPC]:
            c1, c2 = ch[f], args[0]
            if c2 != no:  # if not output
                c2 = make_divisible(c2 * gw, 8)

            args = [c1, c2, *args[1:]]
            if m in [DownC, SPPCSPC, GhostSPPCSPC,
                     BottleneckCSPA, BottleneckCSPB, BottleneckCSPC,
                     RepBottleneckCSPA, RepBottleneckCSPB, RepBottleneckCSPC,
                     ResCSPA, ResCSPB, ResCSPC,
                     RepResCSPA, RepResCSPB, RepResCSPC,
                     ResXCSPA, ResXCSPB, ResXCSPC,
                     RepResXCSPA, RepResXCSPB, RepResXCSPC,
                     GhostCSPA, GhostCSPB, GhostCSPC,
                     STCSPA, STCSPB, STCSPC,
                     ST2CSPA, ST2CSPB, ST2CSPC]:
                args.insert(2, n)  # number of repeats
                n = 1
        elif m is nn.BatchNorm2d:
            args = [ch[f]]
        elif m is Concat:
            c2 = sum([ch[x] for x in f])
        elif m is Chuncat:
            c2 = sum([ch[x] for x in f])
        elif m is Shortcut:
            c2 = ch[f[0]]
        elif m is Foldcut:
            c2 = ch[f] // 2
        elif m in [Detect, IDetect, IAuxDetect, IBin]:
            args.append([ch[x] for x in f])
            if isinstance(args[1], int):  # number of anchors
                args[1] = [list(range(args[1] * 2))] * len(f)
        elif m is ReOrg:
            c2 = ch[f] * 4
        elif m is Contract:
            c2 = ch[f] * args[0] ** 2
        elif m is Expand:
            c2 = ch[f] // args[0] ** 2
        else:
            c2 = ch[f]

        m_ = nn.Sequential(*[m(*args) for _ in range(n)]) if n > 1 else m(*args)  # module
        t = str(m)[8:-2].replace('__main__.', '')  # module type
        np = sum([x.numel() for x in m_.parameters()])  # number params
        m_.i, m_.f, m_.type, m_.np = i, f, t, np  # attach index, 'from' index, type, number params
        logger.info('%3s%18s%3s%10.0f %-40s%-30s' % (i, f, n, np, t, args))  # print
        save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1)  # append to savelist
        layers.append(m_)
        if i == 0:
            ch = []
        ch.append(c2)
    return nn.Sequential(*layers), sorted(save)


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--cfg', type=str, default='yolor-csp-c.yaml', help='model.yaml')
    parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--profile', action='store_true', help='profile model speed')
    opt = parser.parse_args()
    opt.cfg = check_file(opt.cfg)  # check file
    set_logging()
    device = select_device(opt.device)

    # Create model
    model = Model(opt.cfg).to(device)
    model.train()

    if opt.profile:
        img = torch.rand(1, 3, 640, 640).to(device)
        y = model(img, profile=True)

    # Profile
    # img = torch.rand(8 if torch.cuda.is_available() else 1, 3, 640, 640).to(device)
    # y = model(img, profile=True)

    # Tensorboard
    # from torch.utils.tensorboard import SummaryWriter
    # tb_writer = SummaryWriter()
    # print("Run 'tensorboard --logdir=models/runs' to view tensorboard at http://localhost:6006/")
    # tb_writer.add_graph(model.model, img)  # add model to tensorboard
    # tb_writer.add_image('test', img[0], dataformats='CWH')  # add model to tensorboard
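A sketch instantiating Model from a cfg and inspecting the strides computed in __init__ (the cfg path is an assumption; any yaml ending in a Detect/IDetect head behaves the same way):

# Sketch, assuming a standard yolov7 cfg path; parse_model builds the layers and
# __init__ derives per-level strides from a dummy 256x256 forward pass.
import torch
from models.yolo import Model

model = Model('cfg/training/yolov7.yaml', ch=3, nc=80)  # assumed cfg path
print(model.stride)            # e.g. tensor([ 8., 16., 32.]) for P3/P4/P5
out = model(torch.zeros(1, 3, 640, 640))
print([o.shape for o in out])  # training mode: one tensor per detection layer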
24
yolov7-tracker-example/requirements.txt
Normal file
@@ -0,0 +1,24 @@
numpy
cython-bbox==0.1.3
loguru
motmetrics==1.4.0
ninja

pandas
Pillow

PyYAML

scikit-learn
scipy
seaborn

thop
tensorboard
lap
tabulate
tqdm

wandb

gdown
22
yolov7-tracker-example/scripts/get_coco.sh
Normal file
22
yolov7-tracker-example/scripts/get_coco.sh
Normal file
@ -0,0 +1,22 @@
#!/bin/bash
# COCO 2017 dataset http://cocodataset.org
# Download command: bash ./scripts/get_coco.sh

# Download/unzip labels
d='./' # unzip directory
url=https://github.com/ultralytics/yolov5/releases/download/v1.0/
f='coco2017labels-segments.zip' # or 'coco2017labels.zip', 68 MB
echo 'Downloading' $url$f ' ...'
curl -L $url$f -o $f && unzip -q $f -d $d && rm $f & # download, unzip, remove in background

# Download/unzip images
d='./coco/images' # unzip directory
url=http://images.cocodataset.org/zips/
f1='train2017.zip' # 19G, 118k images
f2='val2017.zip' # 1G, 5k images
f3='test2017.zip' # 7G, 41k images (optional)
for f in $f1 $f2 $f3; do
  echo 'Downloading' $url$f '...'
  curl -L $url$f -o $f && unzip -q $f -d $d && rm $f & # download, unzip, remove in background
done
wait # finish background tasks
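Note on the script above: every curl/unzip pipeline ends in '&', so the label archive and all three image archives are fetched and unpacked in parallel, and the final 'wait' blocks until every background job has exited. The same shell pattern in miniature (illustrative only, $url is a placeholder):

    for f in a.zip b.zip; do curl -LO "$url$f" & done; wait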
350
yolov7-tracker-example/test.py
Normal file
350
yolov7-tracker-example/test.py
Normal file
@ -0,0 +1,350 @@
import argparse
import json
import os
from pathlib import Path
from threading import Thread

import numpy as np
import torch
import yaml
from tqdm import tqdm

from models.experimental import attempt_load
from utils.datasets import create_dataloader
from utils.general import coco80_to_coco91_class, check_dataset, check_file, check_img_size, check_requirements, \
    box_iou, non_max_suppression, scale_coords, xyxy2xywh, xywh2xyxy, set_logging, increment_path, colorstr
from utils.metrics import ap_per_class, ConfusionMatrix
from utils.plots import plot_images, output_to_target, plot_study_txt
from utils.torch_utils import select_device, time_synchronized, TracedModel


def test(data,
         weights=None,
         batch_size=32,
         imgsz=640,
         conf_thres=0.001,
         iou_thres=0.6,  # for NMS
         save_json=False,
         single_cls=False,
         augment=False,
         verbose=False,
         model=None,
         dataloader=None,
         save_dir=Path(''),  # for saving images
         save_txt=False,  # for auto-labelling
         save_hybrid=False,  # for hybrid auto-labelling
         save_conf=False,  # save auto-label confidences
         plots=True,
         wandb_logger=None,
         compute_loss=None,
         half_precision=True,
         trace=False,
         is_coco=False):
    # Initialize/load model and set device
    training = model is not None
    if training:  # called by train.py
        device = next(model.parameters()).device  # get model device

    else:  # called directly
        set_logging()
        device = select_device(opt.device, batch_size=batch_size)

        # Directories
        save_dir = Path(increment_path(Path(opt.project) / opt.name, exist_ok=opt.exist_ok))  # increment run
        (save_dir / 'labels' if save_txt else save_dir).mkdir(parents=True, exist_ok=True)  # make dir

        # Load model
        model = attempt_load(weights, map_location=device)  # load FP32 model
        gs = max(int(model.stride.max()), 32)  # grid size (max stride)
        imgsz = check_img_size(imgsz, s=gs)  # check img_size

        if trace:
            model = TracedModel(model, device, opt.img_size)

    # Half
    half = device.type != 'cpu' and half_precision  # half precision only supported on CUDA
    if half:
        model.half()

    # Configure
    model.eval()
    if isinstance(data, str):
        is_coco = data.endswith('coco.yaml')
        with open(data) as f:
            data = yaml.load(f, Loader=yaml.SafeLoader)
    check_dataset(data)  # check
    nc = 1 if single_cls else int(data['nc'])  # number of classes
    iouv = torch.linspace(0.5, 0.95, 10).to(device)  # iou vector for mAP@0.5:0.95
    niou = iouv.numel()

    # Logging
    log_imgs = 0
    if wandb_logger and wandb_logger.wandb:
        log_imgs = min(wandb_logger.log_imgs, 100)
    # Dataloader
    if not training:
        if device.type != 'cpu':
            model(torch.zeros(1, 3, imgsz, imgsz).to(device).type_as(next(model.parameters())))  # run once
        task = opt.task if opt.task in ('train', 'val', 'test') else 'val'  # path to train/val/test images
        dataloader = create_dataloader(data[task], imgsz, batch_size, gs, opt, pad=0.5, rect=True,
                                       prefix=colorstr(f'{task}: '))[0]

    seen = 0

    confusion_matrix = ConfusionMatrix(nc=nc)
    names = {k: v for k, v in enumerate(model.names if hasattr(model, 'names') else model.module.names)}
    coco91class = coco80_to_coco91_class()
    s = ('%20s' + '%12s' * 6) % ('Class', 'Images', 'Labels', 'P', 'R', 'mAP@.5', 'mAP@.5:.95')
    p, r, f1, mp, mr, map50, map, t0, t1 = 0., 0., 0., 0., 0., 0., 0., 0., 0.
    loss = torch.zeros(3, device=device)
    jdict, stats, ap, ap_class, wandb_images = [], [], [], [], []
    for batch_i, (img, targets, paths, shapes) in enumerate(tqdm(dataloader, desc=s)):
        img = img.to(device, non_blocking=True)
        img = img.half() if half else img.float()  # uint8 to fp16/32
        img /= 255.0  # 0 - 255 to 0.0 - 1.0
        targets = targets.to(device)
        nb, _, height, width = img.shape  # batch size, channels, height, width

        with torch.no_grad():
            # Run model
            t = time_synchronized()
            out, train_out = model(img, augment=augment)  # inference and training outputs
            t0 += time_synchronized() - t

            # Compute loss
            if compute_loss:
                loss += compute_loss([x.float() for x in train_out], targets)[1][:3]  # box, obj, cls

            # Run NMS
            targets[:, 2:] *= torch.Tensor([width, height, width, height]).to(device)  # to pixels
            lb = [targets[targets[:, 0] == i, 1:] for i in range(nb)] if save_hybrid else []  # for autolabelling
            t = time_synchronized()
            out = non_max_suppression(out, conf_thres=conf_thres, iou_thres=iou_thres, labels=lb, multi_label=True)
            t1 += time_synchronized() - t

        # Statistics per image
        for si, pred in enumerate(out):
            labels = targets[targets[:, 0] == si, 1:]
            nl = len(labels)
            tcls = labels[:, 0].tolist() if nl else []  # target class
            path = Path(paths[si])
            seen += 1

            if len(pred) == 0:
                if nl:
                    stats.append((torch.zeros(0, niou, dtype=torch.bool), torch.Tensor(), torch.Tensor(), tcls))
                continue

            # Predictions
            predn = pred.clone()
            scale_coords(img[si].shape[1:], predn[:, :4], shapes[si][0], shapes[si][1])  # native-space pred

            # Append to text file
            if save_txt:
                gn = torch.tensor(shapes[si][0])[[1, 0, 1, 0]]  # normalization gain whwh
                for *xyxy, conf, cls in predn.tolist():
                    xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()  # normalized xywh
                    line = (cls, *xywh, conf) if save_conf else (cls, *xywh)  # label format
                    with open(save_dir / 'labels' / (path.stem + '.txt'), 'a') as f:
                        f.write(('%g ' * len(line)).rstrip() % line + '\n')

            # W&B logging - Media Panel Plots
            if len(wandb_images) < log_imgs and wandb_logger.current_epoch > 0:  # Check for test operation
                if wandb_logger.current_epoch % wandb_logger.bbox_interval == 0:
                    box_data = [{"position": {"minX": xyxy[0], "minY": xyxy[1], "maxX": xyxy[2], "maxY": xyxy[3]},
                                 "class_id": int(cls),
                                 "box_caption": "%s %.3f" % (names[cls], conf),
                                 "scores": {"class_score": conf},
                                 "domain": "pixel"} for *xyxy, conf, cls in pred.tolist()]
                    boxes = {"predictions": {"box_data": box_data, "class_labels": names}}  # inference-space
                    wandb_images.append(wandb_logger.wandb.Image(img[si], boxes=boxes, caption=path.name))
            wandb_logger.log_training_progress(predn, path, names) if wandb_logger and wandb_logger.wandb_run else None

            # Append to pycocotools JSON dictionary
            if save_json:
                # [{"image_id": 42, "category_id": 18, "bbox": [258.15, 41.29, 348.26, 243.78], "score": 0.236}, ...
                image_id = int(path.stem) if path.stem.isnumeric() else path.stem
                box = xyxy2xywh(predn[:, :4])  # xywh
                box[:, :2] -= box[:, 2:] / 2  # xy center to top-left corner
                for p, b in zip(pred.tolist(), box.tolist()):
                    jdict.append({'image_id': image_id,
                                  'category_id': coco91class[int(p[5])] if is_coco else int(p[5]),
                                  'bbox': [round(x, 3) for x in b],
                                  'score': round(p[4], 5)})

            # Assign all predictions as incorrect
            correct = torch.zeros(pred.shape[0], niou, dtype=torch.bool, device=device)
            if nl:
                detected = []  # target indices
                tcls_tensor = labels[:, 0]

                # target boxes
                tbox = xywh2xyxy(labels[:, 1:5])
                scale_coords(img[si].shape[1:], tbox, shapes[si][0], shapes[si][1])  # native-space labels
                if plots:
                    confusion_matrix.process_batch(predn, torch.cat((labels[:, 0:1], tbox), 1))

                # Per target class
                for cls in torch.unique(tcls_tensor):
                    ti = (cls == tcls_tensor).nonzero(as_tuple=False).view(-1)  # target indices
                    pi = (cls == pred[:, 5]).nonzero(as_tuple=False).view(-1)  # prediction indices

                    # Search for detections
                    if pi.shape[0]:
                        # Prediction to target ious
                        ious, i = box_iou(predn[pi, :4], tbox[ti]).max(1)  # best ious, indices

                        # Append detections
                        detected_set = set()
                        for j in (ious > iouv[0]).nonzero(as_tuple=False):
                            d = ti[i[j]]  # detected target
                            if d.item() not in detected_set:
                                detected_set.add(d.item())
                                detected.append(d)
                                correct[pi[j]] = ious[j] > iouv  # iou_thres is 1xn
                                if len(detected) == nl:  # all targets already located in image
                                    break

            # Append statistics (correct, conf, pcls, tcls)
            stats.append((correct.cpu(), pred[:, 4].cpu(), pred[:, 5].cpu(), tcls))

        # Plot images
        if plots and batch_i < 3:
            f = save_dir / f'test_batch{batch_i}_labels.jpg'  # labels
            Thread(target=plot_images, args=(img, targets, paths, f, names), daemon=True).start()
            f = save_dir / f'test_batch{batch_i}_pred.jpg'  # predictions
            Thread(target=plot_images, args=(img, output_to_target(out), paths, f, names), daemon=True).start()

    # Compute statistics
    stats = [np.concatenate(x, 0) for x in zip(*stats)]  # to numpy
    if len(stats) and stats[0].any():
        p, r, ap, f1, ap_class = ap_per_class(*stats, plot=plots, save_dir=save_dir, names=names)
        ap50, ap = ap[:, 0], ap.mean(1)  # AP@0.5, AP@0.5:0.95
        mp, mr, map50, map = p.mean(), r.mean(), ap50.mean(), ap.mean()
        nt = np.bincount(stats[3].astype(np.int64), minlength=nc)  # number of targets per class
    else:
        nt = torch.zeros(1)

    # Print results
    pf = '%20s' + '%12i' * 2 + '%12.3g' * 4  # print format
    print(pf % ('all', seen, nt.sum(), mp, mr, map50, map))

    # Print results per class
    if (verbose or (nc < 50 and not training)) and nc > 1 and len(stats):
        for i, c in enumerate(ap_class):
            print(pf % (names[c], seen, nt[c], p[i], r[i], ap50[i], ap[i]))

    # Print speeds
    t = tuple(x / seen * 1E3 for x in (t0, t1, t0 + t1)) + (imgsz, imgsz, batch_size)  # tuple
    if not training:
        print('Speed: %.1f/%.1f/%.1f ms inference/NMS/total per %gx%g image at batch-size %g' % t)

    # Plots
    if plots:
        confusion_matrix.plot(save_dir=save_dir, names=list(names.values()))
        if wandb_logger and wandb_logger.wandb:
            val_batches = [wandb_logger.wandb.Image(str(f), caption=f.name) for f in sorted(save_dir.glob('test*.jpg'))]
            wandb_logger.log({"Validation": val_batches})
    if wandb_images:
        wandb_logger.log({"Bounding Box Debugger/Images": wandb_images})

    # Save JSON
    if save_json and len(jdict):
        w = Path(weights[0] if isinstance(weights, list) else weights).stem if weights is not None else ''  # weights
        anno_json = './coco/annotations/instances_val2017.json'  # annotations json
        pred_json = str(save_dir / f"{w}_predictions.json")  # predictions json
        print('\nEvaluating pycocotools mAP... saving %s...' % pred_json)
        with open(pred_json, 'w') as f:
            json.dump(jdict, f)

        try:  # https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocoEvalDemo.ipynb
            from pycocotools.coco import COCO
            from pycocotools.cocoeval import COCOeval

            anno = COCO(anno_json)  # init annotations api
            pred = anno.loadRes(pred_json)  # init predictions api
            eval = COCOeval(anno, pred, 'bbox')
            if is_coco:
                eval.params.imgIds = [int(Path(x).stem) for x in dataloader.dataset.img_files]  # image IDs to evaluate
            eval.evaluate()
            eval.accumulate()
            eval.summarize()
            map, map50 = eval.stats[:2]  # update results (mAP@0.5:0.95, mAP@0.5)
        except Exception as e:
            print(f'pycocotools unable to run: {e}')

    # Return results
    model.float()  # for training
    if not training:
        s = f"\n{len(list(save_dir.glob('labels/*.txt')))} labels saved to {save_dir / 'labels'}" if save_txt else ''
        print(f"Results saved to {save_dir}{s}")
    maps = np.zeros(nc) + map
    for i, c in enumerate(ap_class):
        maps[c] = ap[i]
    return (mp, mr, map50, map, *(loss.cpu() / len(dataloader)).tolist()), maps, t


if __name__ == '__main__':
    parser = argparse.ArgumentParser(prog='test.py')
    parser.add_argument('--dataset', type=str, default='COCO', help='dataset name')

    parser.add_argument('--weights', nargs='+', type=str, default='yolov7.pt', help='model.pt path(s)')
    parser.add_argument('--data', type=str, default='data/coco.yaml', help='*.data path')
    parser.add_argument('--batch-size', type=int, default=32, help='size of each image batch')
    parser.add_argument('--img-size', type=int, default=640, help='inference size (pixels)')
    parser.add_argument('--conf-thres', type=float, default=0.001, help='object confidence threshold')
    parser.add_argument('--iou-thres', type=float, default=0.65, help='IOU threshold for NMS')
    parser.add_argument('--task', default='val', help='train, val, test, speed or study')
    parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--single-cls', action='store_true', help='treat as single-class dataset')
    parser.add_argument('--augment', action='store_true', help='augmented inference')
    parser.add_argument('--verbose', action='store_true', help='report mAP by class')
    parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
    parser.add_argument('--save-hybrid', action='store_true', help='save label+prediction hybrid results to *.txt')
    parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
    parser.add_argument('--save-json', action='store_true', help='save a cocoapi-compatible JSON results file')
    parser.add_argument('--project', default='runs/test', help='save to project/name')
    parser.add_argument('--name', default='exp', help='save to project/name')
    parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
    parser.add_argument('--no-trace', action='store_true', help='don`t trace model')
    opt = parser.parse_args()
    opt.save_json |= opt.data.endswith('coco.yaml')
    opt.data = check_file(opt.data)  # check file
    print(opt)
    #check_requirements()

    if opt.task in ('train', 'val', 'test'):  # run normally
        test(opt.data,
             opt.weights,
             opt.batch_size,
             opt.img_size,
             opt.conf_thres,
             opt.iou_thres,
             opt.save_json,
             opt.single_cls,
             opt.augment,
             opt.verbose,
             save_txt=opt.save_txt | opt.save_hybrid,
             save_hybrid=opt.save_hybrid,
             save_conf=opt.save_conf,
             trace=not opt.no_trace,
             )

    elif opt.task == 'speed':  # speed benchmarks
        for w in opt.weights:
            test(opt.data, w, opt.batch_size, opt.img_size, 0.25, 0.45, save_json=False, plots=False)

    elif opt.task == 'study':  # run over a range of settings and save/plot
        # python test.py --task study --data coco.yaml --iou 0.65 --weights yolov7.pt
        x = list(range(256, 1536 + 128, 128))  # x axis (image sizes)
        for w in opt.weights:
            f = f'study_{Path(opt.data).stem}_{Path(w).stem}.txt'  # filename to save to
            y = []  # y axis
            for i in x:  # img-size
                print(f'\nRunning {f} point {i}...')
                r, _, t = test(opt.data, w, opt.batch_size, i, opt.conf_thres, opt.iou_thres, opt.save_json,
                               plots=False)
                y.append(r + t)  # results and times
            np.savetxt(f, y, fmt='%10.4g')  # save
        os.system('zip -r study.zip study_*.txt')
        plot_study_txt(x=x)  # plot
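A typical standalone evaluation run with the flags defined above (a sketch; the weights file and data YAML are placeholders for whatever exists locally):

    python test.py --data data/coco.yaml --weights yolov7.pt --img-size 640 --batch-size 32 --task val --device 0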
180
yolov7-tracker-example/tools/convert_MOT17_to_yolo.py
Normal file
180
yolov7-tracker-example/tools/convert_MOT17_to_yolo.py
Normal file
@ -0,0 +1,180 @@
"""
Convert MOT17 to YOLO v5 format:
class_id, xc_norm, yc_norm, w_norm, h_norm
"""

import os
import os.path as osp
import argparse
import cv2
import glob
import numpy as np
import random

DATA_ROOT = '/data/wujiapeng/datasets/MOT17/'

image_wh_dict = {}  # seq -> (w, h) dict, used for normalization


def generate_imgs_and_labels(opts):
    """
    Generate the txt file of image paths and the YOLO-format ground-truth files.
    """
    if opts.split == 'test':
        seq_list = os.listdir(osp.join(DATA_ROOT, 'test'))
    else:
        seq_list = os.listdir(osp.join(DATA_ROOT, 'train'))
        seq_list = [item for item in seq_list if 'FRCNN' in item]  # keeping only the FRCNN copies is enough
        if 'val' in opts.split: opts.half = True  # the val split takes half of the training set

    print('--------------------------')
    print(f'Total {len(seq_list)} seqs!!')
    print(seq_list)

    if opts.random:
        random.shuffle(seq_list)

    # define the category; MOT has only one class
    CATEGORY_ID = 0  # pedestrian

    # define the frame range
    frame_range = {'start': 0.0, 'end': 1.0}
    if opts.half:  # half: keep only the first half of each sequence
        frame_range['end'] = 0.5

    if opts.split == 'test':
        process_train_test(seqs=seq_list, frame_range=frame_range, cat_id=CATEGORY_ID, split='test')
    else:
        process_train_test(seqs=seq_list, frame_range=frame_range, cat_id=CATEGORY_ID, split=opts.split)


def process_train_test(seqs: list, frame_range: dict, cat_id: int = 0, split: str = 'train') -> None:
    """
    Process the MOT17 train or test split.
    The two cases are similar, hence this shared helper function.

    """

    for seq in seqs:
        print(f'Dealing with {split} dataset...')

        img_dir = osp.join(DATA_ROOT, 'train', seq, 'img1') if split != 'test' else osp.join(DATA_ROOT, 'test', seq, 'img1')  # image folder
        imgs = sorted(os.listdir(img_dir))  # relative paths of all images
        seq_length = len(imgs)  # sequence length

        if split != 'test':

            # get the image width and height
            img_eg = cv2.imread(osp.join(img_dir, imgs[0]))
            w0, h0 = img_eg.shape[1], img_eg.shape[0]  # original width and height

            ann_of_seq_path = os.path.join(img_dir, '../', 'gt', 'gt.txt')  # GT file path
            ann_of_seq = np.loadtxt(ann_of_seq_path, dtype=np.float32, delimiter=',')  # GT content

            gt_to_path = osp.join(DATA_ROOT, 'labels', split, seq)  # target folder for the ground-truth files
            # create it if it does not exist
            if not osp.exists(gt_to_path):
                os.makedirs(gt_to_path)

            exist_gts = []  # one element per frame of the seq, indicating whether the frame has GT boxes
            # frames without GT are skipped when writing image paths to train.txt

            for idx, img in enumerate(imgs):
                # img is like: img000001.jpg
                if idx < int(seq_length * frame_range['start']) or idx > int(seq_length * frame_range['end']):
                    continue

                # step 1: create symlinks for the images
                # print('step1, creating imgs symlink...')
                if opts.generate_imgs:
                    img_to_path = osp.join(DATA_ROOT, 'images', split, seq)  # where this sequence's images are stored

                    if not osp.exists(img_to_path):
                        os.makedirs(img_to_path)

                    os.symlink(osp.join(img_dir, img),
                               osp.join(img_to_path, img))  # create the symlink

                # step 2: generate the ground-truth files
                # print('step2, generating gt files...')
                ann_of_current_frame = ann_of_seq[ann_of_seq[:, 0] == float(idx + 1), :]  # select this frame's objects from the GT file
                exist_gts.append(True if ann_of_current_frame.shape[0] != 0 else False)

                gt_to_file = osp.join(gt_to_path, img[: -4] + '.txt')

                with open(gt_to_file, 'w') as f_gt:
                    for i in range(ann_of_current_frame.shape[0]):
                        if int(ann_of_current_frame[i][6]) == 1 and int(ann_of_current_frame[i][7]) == 1 \
                                and float(ann_of_current_frame[i][8]) > 0.25:
                            # bbox xywh
                            x0, y0 = int(ann_of_current_frame[i][2]), int(ann_of_current_frame[i][3])
                            x0, y0 = max(x0, 0), max(y0, 0)
                            w, h = int(ann_of_current_frame[i][4]), int(ann_of_current_frame[i][5])

                            xc, yc = x0 + w // 2, y0 + h // 2  # center point x, y

                            # normalize
                            xc, yc = xc / w0, yc / h0
                            xc, yc = min(xc, 1.0), min(yc, 1.0)
                            w, h = w / w0, h / h0
                            w, h = min(w, 1.0), min(h, 1.0)
                            assert w <= 1 and h <= 1, f'{w}, {h} must be normed, original size{w0}, {h0}'
                            assert xc >= 0 and yc >= 0, f'{x0}, {y0} must be positive'
                            assert xc <= 1 and yc <= 1, f'{x0}, {y0} must be no greater than 1'
                            category_id = cat_id

                            write_line = '{:d} {:.6f} {:.6f} {:.6f} {:.6f}\n'.format(
                                category_id, xc, yc, w, h)

                            f_gt.write(write_line)

                f_gt.close()

        else:  # test: only create image symlinks
            for idx, img in enumerate(imgs):
                # img is like: img000001.jpg
                if idx < int(seq_length * frame_range['start']) or idx > int(seq_length * frame_range['end']):
                    continue

                # step 1: create symlinks for the images
                # print('step1, creating imgs symlink...')
                if opts.generate_imgs:
                    img_to_path = osp.join(DATA_ROOT, 'images', split, seq)  # where this sequence's images are stored

                    if not osp.exists(img_to_path):
                        os.makedirs(img_to_path)

                    os.symlink(osp.join(img_dir, img),
                               osp.join(img_to_path, img))  # create the symlink

        # step 3: generate the image index file (train.txt etc.)
        print(f'generating img index file of {seq}')
        to_file = os.path.join('./mot17/', split + '.txt')
        with open(to_file, 'a') as f:
            for idx, img in enumerate(imgs):
                if idx < int(seq_length * frame_range['start']) or idx > int(seq_length * frame_range['end']):
                    continue

                if split == 'test' or exist_gts[idx]:
                    f.write('MOT17/' + 'images/' + split + '/' \
                            + seq + '/' + img + '\n')

        f.close()


if __name__ == '__main__':
    if not osp.exists('./mot17'):
        os.system('mkdir mot17')

    parser = argparse.ArgumentParser()
    parser.add_argument('--split', type=str, default='train', help='train, test or val')
    parser.add_argument('--generate_imgs', action='store_true', help='whether to generate soft links of imgs')
    parser.add_argument('--certain_seqs', action='store_true', help='for debug')
    parser.add_argument('--half', action='store_true', help='half frames')
    parser.add_argument('--ratio', type=float, default=0.8, help='ratio of train sequences to all sequences')
    parser.add_argument('--random', action='store_true', help='randomly split train and test')

    opts = parser.parse_args()

    generate_imgs_and_labels(opts)
    # python tools/convert_MOT17_to_yolo.py --split train --generate_imgs
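As a worked example of the label line the converter above writes (not from the source): a pedestrian box with top-left corner (928, 476) and size 64x128 in a 1920x1080 frame gives xc = 960/1920 = 0.5, yc = 540/1080 = 0.5, w = 64/1920, h = 128/1080, i.e. the line:

    0 0.500000 0.500000 0.033333 0.118519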
159
yolov7-tracker-example/tools/convert_UAVDT_to_yolo.py
Normal file
159
yolov7-tracker-example/tools/convert_UAVDT_to_yolo.py
Normal file
@ -0,0 +1,159 @@
"""
Convert UAVDT to YOLO v5 format:
class_id, xc_norm, yc_norm, w_norm, h_norm
"""

import os
import os.path as osp
import argparse
import cv2
import glob
import numpy as np
import random

DATA_ROOT = '/data/wujiapeng/datasets/UAVDT/'

image_wh_dict = {}  # seq -> (w, h) dict, used for normalization


def generate_imgs_and_labels(opts):
    """
    Generate the txt file of image paths and the YOLO-format ground-truth files.
    """
    seq_list = os.listdir(osp.join(DATA_ROOT, 'UAV-benchmark-M'))
    print('--------------------------')
    print(f'Total {len(seq_list)} seqs!!')
    # split into train and test
    if opts.random:
        random.shuffle(seq_list)

    bound = int(opts.ratio * len(seq_list))
    train_seq_list = seq_list[: bound]
    test_seq_list = seq_list[bound:]
    del bound
    print(f'train dataset: {train_seq_list}')
    print(f'test dataset: {test_seq_list}')
    print('--------------------------')

    if not osp.exists('./uavdt/'):
        os.makedirs('./uavdt/')

    # define the category; UAVDT has only one class
    CATEGORY_ID = 0  # car

    # define the frame range
    frame_range = {'start': 0.0, 'end': 1.0}
    if opts.half:  # half: keep only the first half of each sequence
        frame_range['end'] = 0.5

    # process train and test separately
    process_train_test(train_seq_list, frame_range, CATEGORY_ID, 'train')
    process_train_test(test_seq_list, {'start': 0.0, 'end': 1.0}, CATEGORY_ID, 'test')
    print('All Done!!')


def process_train_test(seqs: list, frame_range: dict, cat_id: int = 0, split: str = 'train') -> None:
    """
    Process the UAVDT train or test split.
    The two cases are similar, hence this shared helper function.

    """

    for seq in seqs:
        print(f'Dealing with {split} dataset...')

        img_dir = osp.join(DATA_ROOT, 'UAV-benchmark-M', seq, 'img1')  # image folder
        imgs = sorted(os.listdir(img_dir))  # relative paths of all images
        seq_length = len(imgs)  # sequence length

        # get the image width and height
        img_eg = cv2.imread(osp.join(img_dir, imgs[0]))
        w0, h0 = img_eg.shape[1], img_eg.shape[0]  # original width and height

        ann_of_seq_path = os.path.join(img_dir, '../', 'gt', 'gt.txt')  # GT file path
        ann_of_seq = np.loadtxt(ann_of_seq_path, dtype=np.float32, delimiter=',')  # GT content

        gt_to_path = osp.join(DATA_ROOT, 'labels', split, seq)  # target folder for the ground-truth files
        # create it if it does not exist
        if not osp.exists(gt_to_path):
            os.makedirs(gt_to_path)

        exist_gts = []  # one element per frame of the seq, indicating whether the frame has GT boxes
        # frames without GT are skipped when writing image paths to train.txt

        for idx, img in enumerate(imgs):
            # img is like: img000001.jpg
            if idx < int(seq_length * frame_range['start']) or idx > int(seq_length * frame_range['end']):
                continue

            # step 1: create symlinks for the images
            # print('step1, creating imgs symlink...')
            if opts.generate_imgs:
                img_to_path = osp.join(DATA_ROOT, 'images', split, seq)  # where this sequence's images are stored

                if not osp.exists(img_to_path):
                    os.makedirs(img_to_path)

                os.symlink(osp.join(img_dir, img),
                           osp.join(img_to_path, img))  # create the symlink

            # step 2: generate the ground-truth files
            # print('step2, generating gt files...')
            ann_of_current_frame = ann_of_seq[ann_of_seq[:, 0] == float(idx + 1), :]  # select this frame's objects from the GT file
            exist_gts.append(True if ann_of_current_frame.shape[0] != 0 else False)

            gt_to_file = osp.join(gt_to_path, img[:-4] + '.txt')

            with open(gt_to_file, 'w') as f_gt:
                for i in range(ann_of_current_frame.shape[0]):
                    if int(ann_of_current_frame[i][6]) == 1:
                        # bbox xywh
                        x0, y0 = int(ann_of_current_frame[i][2]), int(ann_of_current_frame[i][3])
                        w, h = int(ann_of_current_frame[i][4]), int(ann_of_current_frame[i][5])

                        xc, yc = x0 + w // 2, y0 + h // 2  # center point x, y

                        # normalize
                        xc, yc = xc / w0, yc / h0
                        w, h = w / w0, h / h0
                        category_id = cat_id

                        write_line = '{:d} {:.6f} {:.6f} {:.6f} {:.6f}\n'.format(
                            category_id, xc, yc, w, h)

                        f_gt.write(write_line)

            f_gt.close()

        # step 3: generate the image index file (train.txt etc.)
        print(f'generating img index file of {seq}')
        to_file = os.path.join('./uavdt/', split + '.txt')
        with open(to_file, 'a') as f:
            for idx, img in enumerate(imgs):
                if idx < int(seq_length * frame_range['start']) or idx > int(seq_length * frame_range['end']):
                    continue

                if exist_gts[idx]:
                    f.write('UAVDT/' + 'images/' + split + '/' \
                            + seq + '/' + img + '\n')

        f.close()


if __name__ == '__main__':
    if not osp.exists('./uavdt'):
        os.system('mkdir ./uavdt')
    else:
        os.system('rm -rf ./uavdt/*')

    parser = argparse.ArgumentParser()
    parser.add_argument('--generate_imgs', action='store_true', help='whether to generate soft links of imgs')
    parser.add_argument('--certain_seqs', action='store_true', help='for debug')
    parser.add_argument('--half', action='store_true', help='half frames')
    parser.add_argument('--ratio', type=float, default=0.8, help='ratio of train sequences to all sequences')
    parser.add_argument('--random', action='store_true', help='randomly split train and test')

    opts = parser.parse_args()

    generate_imgs_and_labels(opts)
    # python tools/convert_UAVDT_to_yolo.py --generate_imgs --half --random
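A quick check of the split arithmetic above (worked example, not from the source): with 50 sequences and the default --ratio 0.8, bound = int(0.8 * 50) = 40, so the first 40 shuffled sequences become the train set and the remaining 10 the test set.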
182
yolov7-tracker-example/tools/convert_VisDrone_to_yolo.py
Normal file
182
yolov7-tracker-example/tools/convert_VisDrone_to_yolo.py
Normal file
@ -0,0 +1,182 @@
"""
Convert VisDrone to YOLO v5 format:
class_id, xc_norm, yc_norm, w_norm, h_norm
"""
import os
import os.path as osp
import argparse
import cv2
import glob

DATA_ROOT = '/data/wujiapeng/datasets/VisDrone2019/VisDrone2019'


# the two seq lists below only matter when tracking cars only
certain_seqs = ['uav0000071_03240_v', 'uav0000072_04488_v','uav0000072_05448_v', 'uav0000072_06432_v','uav0000124_00944_v','uav0000126_00001_v','uav0000138_00000_v','uav0000145_00000_v','uav0000150_02310_v','uav0000222_03150_v','uav0000239_12336_v','uav0000243_00001_v',
                'uav0000248_00001_v','uav0000263_03289_v','uav0000266_03598_v','uav0000273_00001_v','uav0000279_00001_v','uav0000281_00460_v','uav0000289_00001_v','uav0000289_06922_v','uav0000307_00000_v',
                'uav0000308_00000_v','uav0000308_01380_v','uav0000326_01035_v','uav0000329_04715_v','uav0000361_02323_v','uav0000366_00001_v']

ignored_seqs = ['uav0000013_00000_v', 'uav0000013_01073_v', 'uav0000013_01392_v',
                'uav0000020_00406_v', 'uav0000079_00480_v',
                'uav0000084_00000_v', 'uav0000099_02109_v', 'uav0000086_00000_v',
                'uav0000073_00600_v', 'uav0000073_04464_v', 'uav0000088_00290_v']

image_wh_dict = {}  # seq -> (w, h) dict, used for normalization


def generate_imgs(split_name='VisDrone2019-MOT-train', generate_imgs=True, if_certain_seqs=False, car_only=False):
    """
    Generate the image folders, e.g. VisDrone/images/VisDrone2019-MOT-train/uav0000076_00720_v/000010.jpg,
    and build the seq -> (w, h) dict used later.

    split: str, 'VisDrone2019-MOT-train', 'VisDrone2019-MOT-val' or 'VisDrone2019-MOT-test-dev'
    if_certain_seqs: bool, use for debug.
    """

    if not if_certain_seqs:
        seq_list = os.listdir(osp.join(DATA_ROOT, split_name, 'sequences'))  # all sequence names
    else:
        seq_list = certain_seqs

    if car_only:  # when tracking cars only, ignore the pedestrian-heavy videos
        seq_list = [seq for seq in seq_list if seq not in ignored_seqs]

    # loop over all sequences, create image symlinks and update the seq -> (w, h) dict
    if_write_txt = True if glob.glob('./visdrone/*.txt') else False
    # if_write_txt = True if not osp.exists(f'./visdrone/.txt') else False  # whether the txt (e.g. visdrone.train) needs to be written

    if not if_write_txt:
        for seq in seq_list:
            img_dir = osp.join(DATA_ROOT, split_name, 'sequences', seq)  # folder with all images of this seq

            imgs = sorted(os.listdir(img_dir))  # all images

            if generate_imgs:
                to_path = osp.join(DATA_ROOT, 'images', split_name, seq)  # where this sequence's images are stored
                if not osp.exists(to_path):
                    os.makedirs(to_path)

                for img in imgs:  # loop over the images of this seq
                    os.symlink(osp.join(img_dir, img),
                               osp.join(to_path, img))  # create the symlink

            img_sample = cv2.imread(osp.join(img_dir, imgs[0]))  # first image of each seq, used to get w, h
            w, h = img_sample.shape[1], img_sample.shape[0]  # w, h

            image_wh_dict[seq] = (w, h)  # update the seq -> (w, h) dict

        # print(image_wh_dict)
        # return
    else:
        with open('./visdrone.txt', 'a') as f:
            for seq in seq_list:
                img_dir = osp.join(DATA_ROOT, split_name, 'sequences', seq)  # folder with all images of this seq

                imgs = sorted(os.listdir(img_dir))  # all images

                if generate_imgs:
                    to_path = osp.join(DATA_ROOT, 'images', split_name, seq)  # where this sequence's images are stored
                    if not osp.exists(to_path):
                        os.makedirs(to_path)

                    for img in imgs:  # loop over the images of this seq

                        f.write('VisDrone2019/' + 'VisDrone2019/' + 'images/' + split_name + '/' \
                                + seq + '/' + img + '\n')

                        os.symlink(osp.join(img_dir, img),
                                   osp.join(to_path, img))  # create the symlink

                img_sample = cv2.imread(osp.join(img_dir, imgs[0]))  # first image of each seq, used to get w, h
                w, h = img_sample.shape[1], img_sample.shape[0]  # w, h

                image_wh_dict[seq] = (w, h)  # update the seq -> (w, h) dict
        f.close()
    if if_certain_seqs:  # for debug
        print(image_wh_dict)


def generate_labels(split='VisDrone2019-MOT-train', if_certain_seqs=False, car_only=False):
    """
    split: str, 'train', 'val' or 'test'
    if_certain_seqs: bool, use for debug.
    """
    # from choose_anchors import image_wh_dict
    # print(image_wh_dict)
    if not if_certain_seqs:
        seq_list = os.listdir(osp.join(DATA_ROOT, split, 'sequences'))  # sequence list
    else:
        seq_list = certain_seqs

    if car_only:  # when tracking cars only, ignore the pedestrian-heavy videos
        seq_list = [seq for seq in seq_list if seq not in ignored_seqs]
        category_list = ['4', '5', '6', '9']
    else:
        category_list = [str(i) for i in range(1, 11)]

    # category IDs start from 0
    category_dict = {category_list[idx]: idx for idx in range(len(category_list))}
    # one txt file per image,
    # split out of the per-sequence txt
    for seq in seq_list:
        seq_dir = osp.join(DATA_ROOT, split, 'annotations', seq + '.txt')  # ground-truth file
        with open(seq_dir, 'r') as f:
            lines = f.readlines()

            for row in lines:

                current_line = row.split(',')

                frame = current_line[0]  # frame number
                if current_line[6] == '0' or current_line[7] not in category_list:
                    continue

                to_file = osp.join(DATA_ROOT, 'labels', split, seq)  # folder to write to
                # create it if it does not exist
                if not osp.exists(to_file):
                    os.makedirs(to_file)

                to_file = osp.join(to_file, frame.zfill(7) + '.txt')

                category_id = category_dict[current_line[7]]
                x0, y0 = int(current_line[2]), int(current_line[3])  # top-left x, y
                w, h = int(current_line[4]), int(current_line[5])  # width, height

                x_c, y_c = x0 + w // 2, y0 + h // 2  # center point x, y

                image_w, image_h = image_wh_dict[seq][0], image_wh_dict[seq][1]  # image width and height
                # normalize
                w, h = w / image_w, h / image_h
                x_c, y_c = x_c / image_w, y_c / image_h


                with open(to_file, 'a') as f_to:
                    write_line = '{:d} {:.6f} {:.6f} {:.6f} {:.6f}\n'.format(
                        category_id, x_c, y_c, w, h)

                    f_to.write(write_line)

                f_to.close()


        f.close()


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--split', type=str, default='VisDrone2019-MOT-train', help='train or test')
    parser.add_argument('--generate_imgs', action='store_true', help='whether to generate soft links of imgs')
    parser.add_argument('--car_only', action='store_true', help='only cars')
    parser.add_argument('--if_certain_seqs', action='store_true', help='for debug')

    opt = parser.parse_args()
    print('generating images...')
    generate_imgs(opt.split, opt.generate_imgs, opt.if_certain_seqs, opt.car_only)

    print('generating labels...')
    generate_labels(opt.split, opt.if_certain_seqs, opt.car_only)

    print('Done!')


# python convert_VisDrone_to_yolo.py --split VisDrone2019-MOT-train
# python convert_VisDrone_to_yolo.py --split VisDrone2019-MOT-train --car_only --if_certain_seqs
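For the --car_only case above, category_list is ['4', '5', '6', '9'], so the dict comprehension maps them to contiguous YOLO class IDs {'4': 0, '5': 1, '6': 2, '9': 3}; without the flag, VisDrone classes '1'..'10' map to 0..9 (a restatement of the mapping, for reference).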
168
yolov7-tracker-example/tools/convert_VisDrone_to_yolov2.py
Normal file
168
yolov7-tracker-example/tools/convert_VisDrone_to_yolov2.py
Normal file
@ -0,0 +1,168 @@
"""
Convert VisDrone to YOLO v5 format:
class_id, xc_norm, yc_norm, w_norm, h_norm

Changes:
1. merge the image- and label-generating functions into one
2. if a frame has no labels, its image path is no longer written
3. add a half option that keeps only half of each video
"""
import os
import os.path as osp
import argparse
import cv2
import glob
import numpy as np

DATA_ROOT = '/data/wujiapeng/datasets/VisDrone2019/VisDrone2019'


# the two seq lists below only matter when tracking cars only
certain_seqs = ['uav0000071_03240_v', 'uav0000072_04488_v','uav0000072_05448_v', 'uav0000072_06432_v','uav0000124_00944_v','uav0000126_00001_v','uav0000138_00000_v','uav0000145_00000_v','uav0000150_02310_v','uav0000222_03150_v','uav0000239_12336_v','uav0000243_00001_v',
                'uav0000248_00001_v','uav0000263_03289_v','uav0000266_03598_v','uav0000273_00001_v','uav0000279_00001_v','uav0000281_00460_v','uav0000289_00001_v','uav0000289_06922_v','uav0000307_00000_v',
                'uav0000308_00000_v','uav0000308_01380_v','uav0000326_01035_v','uav0000329_04715_v','uav0000361_02323_v','uav0000366_00001_v']

ignored_seqs = ['uav0000013_00000_v', 'uav0000013_01073_v', 'uav0000013_01392_v',
                'uav0000020_00406_v', 'uav0000079_00480_v',
                'uav0000084_00000_v', 'uav0000099_02109_v', 'uav0000086_00000_v',
                'uav0000073_00600_v', 'uav0000073_04464_v', 'uav0000088_00290_v']

image_wh_dict = {}  # seq -> (w, h) dict, used for normalization


def generate_imgs_and_labels(opts):
    """
    Generate the txt file of image paths and the YOLO-format ground-truth files.
    """
    if not opts.certain_seqs:
        seq_list = os.listdir(osp.join(DATA_ROOT, opts.split_name, 'sequences'))  # all sequence names
    else:
        seq_list = certain_seqs

    if opts.car_only:  # when tracking cars only, ignore the pedestrian-heavy videos
        seq_list = [seq for seq in seq_list if seq not in ignored_seqs]
        category_list = [4, 5, 6, 9]  # category IDs of interest, List[int]
    else:
        category_list = [i for i in range(1, 11)]

    print(f'Total {len(seq_list)} seqs!!')
    if not osp.exists('./visdrone/'):
        os.makedirs('./visdrone/')

    # category IDs start from 0
    category_dict = {category_list[idx]: idx for idx in range(len(category_list))}

    txt_name_dict = {'VisDrone2019-MOT-train': 'train',
                     'VisDrone2019-MOT-val': 'val',
                     'VisDrone2019-MOT-test-dev': 'test'}  # mapping from split name to generated txt file name

    # skip writing if the txt already exists
    write_txt = False if os.path.isfile(os.path.join('./visdrone', txt_name_dict[opts.split_name] + '.txt')) else True
    print(f'write txt is {write_txt}')

    frame_range = {'start': 0.0, 'end': 1.0}
    if opts.half:  # VisDrone-half: keep only the first half of each sequence
        frame_range['end'] = 0.5

    # process one sequence at a time
    for seq in seq_list:
        img_dir = osp.join(DATA_ROOT, opts.split_name, 'sequences', seq)  # folder with all images of this seq

        imgs = sorted(os.listdir(img_dir))  # all images
        seq_length = len(imgs)  # sequence length

        img_eg = cv2.imread(os.path.join(img_dir, imgs[0]))  # first image of the seq, used to compute width and height
        w0, h0 = img_eg.shape[1], img_eg.shape[0]  # original width and height

        ann_of_seq_path = os.path.join(DATA_ROOT, opts.split_name, 'annotations', seq + '.txt')  # GT file path
        ann_of_seq = np.loadtxt(ann_of_seq_path, dtype=np.float32, delimiter=',')  # GT content

        gt_to_path = osp.join(DATA_ROOT, 'labels', opts.split_name, seq)  # target folder for the ground-truth files
        # create it if it does not exist
        if not osp.exists(gt_to_path):
            os.makedirs(gt_to_path)

        exist_gts = []  # one element per frame of the seq, indicating whether the frame has GT boxes
        # frames without GT are skipped when writing image paths to train.txt

        for idx, img in enumerate(imgs):
            # img: relative path, i.e. the image name, e.g. 0000001.jpg
            if idx < int(seq_length * frame_range['start']) or idx > int(seq_length * frame_range['end']):
                continue

            # step 1: create symlinks for the images
            # print('step1, creating imgs symlink...')
            if opts.generate_imgs:
                img_to_path = osp.join(DATA_ROOT, 'images', opts.split_name, seq)  # where this sequence's images are stored

                if not osp.exists(img_to_path):
                    os.makedirs(img_to_path)

                os.symlink(osp.join(img_dir, img),
                           osp.join(img_to_path, img))  # create the symlink
                # print('Done!\n')

            # step 2: generate the ground-truth files
            # print('step2, generating gt files...')

            # read from this sequence's ground-truth file
            # ann_idx = int(ann_of_seq[:, 0]) == idx + 1
            ann_of_current_frame = ann_of_seq[ann_of_seq[:, 0] == float(idx + 1), :]  # select this frame's objects from the GT file
            exist_gts.append(True if ann_of_current_frame.shape[0] != 0 else False)

            gt_to_file = osp.join(gt_to_path, img[:-4] + '.txt')

            with open(gt_to_file, 'a') as f_gt:
                for i in range(ann_of_current_frame.shape[0]):

                    category = int(ann_of_current_frame[i][7])
                    if int(ann_of_current_frame[i][6]) == 1 and category in category_list:

                        # bbox xywh
                        x0, y0 = int(ann_of_current_frame[i][2]), int(ann_of_current_frame[i][3])
                        w, h = int(ann_of_current_frame[i][4]), int(ann_of_current_frame[i][5])

                        xc, yc = x0 + w // 2, y0 + h // 2  # center point x, y

                        # normalize
                        xc, yc = xc / w0, yc / h0
                        w, h = w / w0, h / h0

                        category_id = category_dict[category]

                        write_line = '{:d} {:.6f} {:.6f} {:.6f} {:.6f}\n'.format(
                            category_id, xc, yc, w, h)

                        f_gt.write(write_line)

            f_gt.close()
            # print('Done!\n')
        print(f'img symlink and gt files of seq {seq} Done!')
        # step 3: generate the image index file (train.txt etc.)
        print(f'generating img index file of {seq}')
        if write_txt:
            to_file = os.path.join('./visdrone', txt_name_dict[opts.split_name] + '.txt')
            with open(to_file, 'a') as f:
                for idx, img in enumerate(imgs):
                    if idx < int(seq_length * frame_range['start']) or idx > int(seq_length * frame_range['end']):
                        continue

                    if exist_gts[idx]:
                        f.write('VisDrone2019/' + 'VisDrone2019/' + 'images/' + opts.split_name + '/' \
                                + seq + '/' + img + '\n')

            f.close()

    print('All done!!')


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--split_name', type=str, default='VisDrone2019-MOT-train', help='train or test')
    parser.add_argument('--generate_imgs', action='store_true', help='whether to generate soft links of imgs')
    parser.add_argument('--car_only', action='store_true', help='only cars')
    parser.add_argument('--certain_seqs', action='store_true', help='for debug')
    parser.add_argument('--half', action='store_true', help='half frames')

    opts = parser.parse_args()

    generate_imgs_and_labels(opts)
    # python tools/convert_VisDrone_to_yolov2.py --split_name VisDrone2019-MOT-train --generate_imgs --car_only --half
479
yolov7-tracker-example/tools/reparameterization.ipynb
Normal file
479
yolov7-tracker-example/tools/reparameterization.ipynb
Normal file
@ -0,0 +1,479 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "d7cbe5ee",
   "metadata": {},
   "source": [
    "# Reparameterization"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "13393b70",
   "metadata": {},
   "source": [
    "## YOLOv7 reparameterization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bf53becf",
   "metadata": {},
   "outputs": [],
   "source": [
    "# import\n",
    "from copy import deepcopy\n",
    "from models.yolo import Model\n",
    "import torch\n",
    "from utils.torch_utils import select_device, is_parallel\n",
    "\n",
    "device = select_device('0', batch_size=1)\n",
    "# model trained by cfg/training/*.yaml\n",
    "ckpt = torch.load('cfg/training/yolov7.pt', map_location=device)\n",
    "# reparameterized model in cfg/deploy/*.yaml\n",
    "model = Model('cfg/deploy/yolov7.yaml', ch=3, nc=80).to(device)\n",
    "\n",
    "# copy intersect weights\n",
    "state_dict = ckpt['model'].float().state_dict()\n",
    "exclude = []\n",
    "intersect_state_dict = {k: v for k, v in state_dict.items() if k in model.state_dict() and not any(x in k for x in exclude) and v.shape == model.state_dict()[k].shape}\n",
    "model.load_state_dict(intersect_state_dict, strict=False)\n",
    "model.names = ckpt['model'].names\n",
    "model.nc = ckpt['model'].nc\n",
    "\n",
    "# reparametrized YOLOR\n",
    "for i in range(255):\n",
    "    model.state_dict()['model.105.m.0.weight'].data[i, :, :, :] *= state_dict['model.105.im.0.implicit'].data[:, i, : :].squeeze()\n",
    "    model.state_dict()['model.105.m.1.weight'].data[i, :, :, :] *= state_dict['model.105.im.1.implicit'].data[:, i, : :].squeeze()\n",
    "    model.state_dict()['model.105.m.2.weight'].data[i, :, :, :] *= state_dict['model.105.im.2.implicit'].data[:, i, : :].squeeze()\n",
    "model.state_dict()['model.105.m.0.bias'].data += state_dict['model.105.m.0.weight'].mul(state_dict['model.105.ia.0.implicit']).sum(1).squeeze()\n",
    "model.state_dict()['model.105.m.1.bias'].data += state_dict['model.105.m.1.weight'].mul(state_dict['model.105.ia.1.implicit']).sum(1).squeeze()\n",
    "model.state_dict()['model.105.m.2.bias'].data += state_dict['model.105.m.2.weight'].mul(state_dict['model.105.ia.2.implicit']).sum(1).squeeze()\n",
    "model.state_dict()['model.105.m.0.bias'].data *= state_dict['model.105.im.0.implicit'].data.squeeze()\n",
    "model.state_dict()['model.105.m.1.bias'].data *= state_dict['model.105.im.1.implicit'].data.squeeze()\n",
    "model.state_dict()['model.105.m.2.bias'].data *= state_dict['model.105.im.2.implicit'].data.squeeze()\n",
    "\n",
    "# model to be saved\n",
    "ckpt = {'model': deepcopy(model.module if is_parallel(model) else model).half(),\n",
    "        'optimizer': None,\n",
    "        'training_results': None,\n",
    "        'epoch': -1}\n",
    "\n",
    "# save reparameterized model\n",
    "torch.save(ckpt, 'cfg/deploy/yolov7.pt')\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5b396a53",
   "metadata": {},
   "source": [
    "## YOLOv7x reparameterization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9d54d17f",
   "metadata": {},
   "outputs": [],
   "source": [
    "# import\n",
    "from copy import deepcopy\n",
    "from models.yolo import Model\n",
    "import torch\n",
    "from utils.torch_utils import select_device, is_parallel\n",
    "\n",
    "device = select_device('0', batch_size=1)\n",
    "# model trained by cfg/training/*.yaml\n",
    "ckpt = torch.load('cfg/training/yolov7x.pt', map_location=device)\n",
    "# reparameterized model in cfg/deploy/*.yaml\n",
    "model = Model('cfg/deploy/yolov7x.yaml', ch=3, nc=80).to(device)\n",
    "\n",
    "# copy intersect weights\n",
    "state_dict = ckpt['model'].float().state_dict()\n",
    "exclude = []\n",
    "intersect_state_dict = {k: v for k, v in state_dict.items() if k in model.state_dict() and not any(x in k for x in exclude) and v.shape == model.state_dict()[k].shape}\n",
    "model.load_state_dict(intersect_state_dict, strict=False)\n",
    "model.names = ckpt['model'].names\n",
    "model.nc = ckpt['model'].nc\n",
    "\n",
    "# reparametrized YOLOR\n",
    "for i in range(255):\n",
    "    model.state_dict()['model.121.m.0.weight'].data[i, :, :, :] *= state_dict['model.121.im.0.implicit'].data[:, i, : :].squeeze()\n",
    "    model.state_dict()['model.121.m.1.weight'].data[i, :, :, :] *= state_dict['model.121.im.1.implicit'].data[:, i, : :].squeeze()\n",
    "    model.state_dict()['model.121.m.2.weight'].data[i, :, :, :] *= state_dict['model.121.im.2.implicit'].data[:, i, : :].squeeze()\n",
    "model.state_dict()['model.121.m.0.bias'].data += state_dict['model.121.m.0.weight'].mul(state_dict['model.121.ia.0.implicit']).sum(1).squeeze()\n",
    "model.state_dict()['model.121.m.1.bias'].data += state_dict['model.121.m.1.weight'].mul(state_dict['model.121.ia.1.implicit']).sum(1).squeeze()\n",
    "model.state_dict()['model.121.m.2.bias'].data += state_dict['model.121.m.2.weight'].mul(state_dict['model.121.ia.2.implicit']).sum(1).squeeze()\n",
    "model.state_dict()['model.121.m.0.bias'].data *= state_dict['model.121.im.0.implicit'].data.squeeze()\n",
    "model.state_dict()['model.121.m.1.bias'].data *= state_dict['model.121.im.1.implicit'].data.squeeze()\n",
    "model.state_dict()['model.121.m.2.bias'].data *= state_dict['model.121.im.2.implicit'].data.squeeze()\n",
    "\n",
    "# model to be saved\n",
    "ckpt = {'model': deepcopy(model.module if is_parallel(model) else model).half(),\n",
    "        'optimizer': None,\n",
    "        'training_results': None,\n",
    "        'epoch': -1}\n",
    "\n",
    "# save reparameterized model\n",
    "torch.save(ckpt, 'cfg/deploy/yolov7x.pt')\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "11a9108e",
   "metadata": {},
   "source": [
    "## YOLOv7-W6 reparameterization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d032c629",
   "metadata": {},
   "outputs": [],
   "source": [
    "# import\n",
    "from copy import deepcopy\n",
    "from models.yolo import Model\n",
    "import torch\n",
    "from utils.torch_utils import select_device, is_parallel\n",
    "\n",
    "device = select_device('0', batch_size=1)\n",
    "# model trained by cfg/training/*.yaml\n",
    "ckpt = torch.load('cfg/training/yolov7-w6.pt', map_location=device)\n",
    "# reparameterized model in cfg/deploy/*.yaml\n",
    "model = Model('cfg/deploy/yolov7-w6.yaml', ch=3, nc=80).to(device)\n",
    "\n",
    "# copy intersect weights\n",
    "state_dict = ckpt['model'].float().state_dict()\n",
    "exclude = []\n",
    "intersect_state_dict = {k: v for k, v in state_dict.items() if k in model.state_dict() and not any(x in k for x in exclude) and v.shape == model.state_dict()[k].shape}\n",
    "model.load_state_dict(intersect_state_dict, strict=False)\n",
    "model.names = ckpt['model'].names\n",
    "model.nc = ckpt['model'].nc\n",
    "\n",
    "idx = 118\n",
    "idx2 = 122\n",
    "\n",
    "# copy weights of lead head\n",
    "model.state_dict()['model.{}.m.0.weight'.format(idx)].data -= model.state_dict()['model.{}.m.0.weight'.format(idx)].data\n",
    "model.state_dict()['model.{}.m.1.weight'.format(idx)].data -= model.state_dict()['model.{}.m.1.weight'.format(idx)].data\n",
    "model.state_dict()['model.{}.m.2.weight'.format(idx)].data -= model.state_dict()['model.{}.m.2.weight'.format(idx)].data\n",
    "model.state_dict()['model.{}.m.3.weight'.format(idx)].data -= model.state_dict()['model.{}.m.3.weight'.format(idx)].data\n",
    "model.state_dict()['model.{}.m.0.weight'.format(idx)].data += state_dict['model.{}.m.0.weight'.format(idx2)].data\n",
    "model.state_dict()['model.{}.m.1.weight'.format(idx)].data += state_dict['model.{}.m.1.weight'.format(idx2)].data\n",
    "model.state_dict()['model.{}.m.2.weight'.format(idx)].data += state_dict['model.{}.m.2.weight'.format(idx2)].data\n",
    "model.state_dict()['model.{}.m.3.weight'.format(idx)].data += state_dict['model.{}.m.3.weight'.format(idx2)].data\n",
    "model.state_dict()['model.{}.m.0.bias'.format(idx)].data -= model.state_dict()['model.{}.m.0.bias'.format(idx)].data\n",
    "model.state_dict()['model.{}.m.1.bias'.format(idx)].data -= model.state_dict()['model.{}.m.1.bias'.format(idx)].data\n",
    "model.state_dict()['model.{}.m.2.bias'.format(idx)].data -= model.state_dict()['model.{}.m.2.bias'.format(idx)].data\n",
    "model.state_dict()['model.{}.m.3.bias'.format(idx)].data -= model.state_dict()['model.{}.m.3.bias'.format(idx)].data\n",
    "model.state_dict()['model.{}.m.0.bias'.format(idx)].data += state_dict['model.{}.m.0.bias'.format(idx2)].data\n",
    "model.state_dict()['model.{}.m.1.bias'.format(idx)].data += state_dict['model.{}.m.1.bias'.format(idx2)].data\n",
    "model.state_dict()['model.{}.m.2.bias'.format(idx)].data += state_dict['model.{}.m.2.bias'.format(idx2)].data\n",
    "model.state_dict()['model.{}.m.3.bias'.format(idx)].data += state_dict['model.{}.m.3.bias'.format(idx2)].data\n",
    "\n",
    "# reparametrized YOLOR\n",
    "for i in range(255):\n",
    "    model.state_dict()['model.{}.m.0.weight'.format(idx)].data[i, :, :, :] *= state_dict['model.{}.im.0.implicit'.format(idx2)].data[:, i, : :].squeeze()\n",
    "    model.state_dict()['model.{}.m.1.weight'.format(idx)].data[i, :, :, :] *= state_dict['model.{}.im.1.implicit'.format(idx2)].data[:, i, : :].squeeze()\n",
    "    model.state_dict()['model.{}.m.2.weight'.format(idx)].data[i, :, :, :] *= state_dict['model.{}.im.2.implicit'.format(idx2)].data[:, i, : :].squeeze()\n",
    "    model.state_dict()['model.{}.m.3.weight'.format(idx)].data[i, :, :, :] *= state_dict['model.{}.im.3.implicit'.format(idx2)].data[:, i, : :].squeeze()\n",
    "model.state_dict()['model.{}.m.0.bias'.format(idx)].data += state_dict['model.{}.m.0.weight'.format(idx2)].mul(state_dict['model.{}.ia.0.implicit'.format(idx2)]).sum(1).squeeze()\n",
    "model.state_dict()['model.{}.m.1.bias'.format(idx)].data += state_dict['model.{}.m.1.weight'.format(idx2)].mul(state_dict['model.{}.ia.1.implicit'.format(idx2)]).sum(1).squeeze()\n",
    "model.state_dict()['model.{}.m.2.bias'.format(idx)].data += state_dict['model.{}.m.2.weight'.format(idx2)].mul(state_dict['model.{}.ia.2.implicit'.format(idx2)]).sum(1).squeeze()\n",
    "model.state_dict()['model.{}.m.3.bias'.format(idx)].data += state_dict['model.{}.m.3.weight'.format(idx2)].mul(state_dict['model.{}.ia.3.implicit'.format(idx2)]).sum(1).squeeze()\n",
    "model.state_dict()['model.{}.m.0.bias'.format(idx)].data *= state_dict['model.{}.im.0.implicit'.format(idx2)].data.squeeze()\n",
    "model.state_dict()['model.{}.m.1.bias'.format(idx)].data *= state_dict['model.{}.im.1.implicit'.format(idx2)].data.squeeze()\n",
    "model.state_dict()['model.{}.m.2.bias'.format(idx)].data *= state_dict['model.{}.im.2.implicit'.format(idx2)].data.squeeze()\n",
    "model.state_dict()['model.{}.m.3.bias'.format(idx)].data *= state_dict['model.{}.im.3.implicit'.format(idx2)].data.squeeze()\n",
    "\n",
    "# model to be saved\n",
    "ckpt = {'model': deepcopy(model.module if is_parallel(model) else model).half(),\n",
    "        'optimizer': None,\n",
    "        'training_results': None,\n",
    "        'epoch': -1}\n",
    "\n",
    "# save reparameterized model\n",
    "torch.save(ckpt, 'cfg/deploy/yolov7-w6.pt')\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5f093d43",
   "metadata": {},
   "source": [
    "## YOLOv7-E6 reparameterization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "aa2b2142",
   "metadata": {},
   "outputs": [],
   "source": [
    "# import\n",
    "from copy import deepcopy\n",
    "from models.yolo import Model\n",
    "import torch\n",
    "from utils.torch_utils import select_device, is_parallel\n",
    "\n",
    "device = select_device('0', batch_size=1)\n",
    "# model trained by cfg/training/*.yaml\n",
    "ckpt = torch.load('cfg/training/yolov7-e6.pt', map_location=device)\n",
    "# reparameterized model in cfg/deploy/*.yaml\n",
    "model = Model('cfg/deploy/yolov7-e6.yaml', ch=3, nc=80).to(device)\n",
    "\n",
    "# copy intersect weights\n",
    "state_dict = ckpt['model'].float().state_dict()\n",
    "exclude = []\n",
    "intersect_state_dict = {k: v for k, v in state_dict.items() if k in model.state_dict() and not any(x in k for x in exclude) and v.shape == model.state_dict()[k].shape}\n",
    "model.load_state_dict(intersect_state_dict, strict=False)\n",
    "model.names = ckpt['model'].names\n",
    "model.nc = ckpt['model'].nc\n",
    "\n",
    "idx = 140\n",
    "idx2 = 144\n",
    "\n",
    "# copy weights of lead head\n",
    "model.state_dict()['model.{}.m.0.weight'.format(idx)].data -= model.state_dict()['model.{}.m.0.weight'.format(idx)].data\n",
    "model.state_dict()['model.{}.m.1.weight'.format(idx)].data -= model.state_dict()['model.{}.m.1.weight'.format(idx)].data\n",
    "model.state_dict()['model.{}.m.2.weight'.format(idx)].data -= model.state_dict()['model.{}.m.2.weight'.format(idx)].data\n",
    "model.state_dict()['model.{}.m.3.weight'.format(idx)].data -= model.state_dict()['model.{}.m.3.weight'.format(idx)].data\n",
    "model.state_dict()['model.{}.m.0.weight'.format(idx)].data += state_dict['model.{}.m.0.weight'.format(idx2)].data\n",
    "model.state_dict()['model.{}.m.1.weight'.format(idx)].data += state_dict['model.{}.m.1.weight'.format(idx2)].data\n",
    "model.state_dict()['model.{}.m.2.weight'.format(idx)].data += state_dict['model.{}.m.2.weight'.format(idx2)].data\n",
    "model.state_dict()['model.{}.m.3.weight'.format(idx)].data += state_dict['model.{}.m.3.weight'.format(idx2)].data\n",
    "model.state_dict()['model.{}.m.0.bias'.format(idx)].data -= model.state_dict()['model.{}.m.0.bias'.format(idx)].data\n",
    "model.state_dict()['model.{}.m.1.bias'.format(idx)].data -= model.state_dict()['model.{}.m.1.bias'.format(idx)].data\n",
    "model.state_dict()['model.{}.m.2.bias'.format(idx)].data -= model.state_dict()['model.{}.m.2.bias'.format(idx)].data\n",
    "model.state_dict()['model.{}.m.3.bias'.format(idx)].data -= model.state_dict()['model.{}.m.3.bias'.format(idx)].data\n",
    "model.state_dict()['model.{}.m.0.bias'.format(idx)].data += state_dict['model.{}.m.0.bias'.format(idx2)].data\n",
    "model.state_dict()['model.{}.m.1.bias'.format(idx)].data += state_dict['model.{}.m.1.bias'.format(idx2)].data\n",
    "model.state_dict()['model.{}.m.2.bias'.format(idx)].data += state_dict['model.{}.m.2.bias'.format(idx2)].data\n",
    "model.state_dict()['model.{}.m.3.bias'.format(idx)].data += state_dict['model.{}.m.3.bias'.format(idx2)].data\n",
    "\n",
    "# reparametrized YOLOR\n",
    "for i in range(255):\n",
    "    model.state_dict()['model.{}.m.0.weight'.format(idx)].data[i, :, :, :] *= state_dict['model.{}.im.0.implicit'.format(idx2)].data[:, i, : :].squeeze()\n",
    "    model.state_dict()['model.{}.m.1.weight'.format(idx)].data[i, :, :, :] *= state_dict['model.{}.im.1.implicit'.format(idx2)].data[:, i, : :].squeeze()\n",
    "    model.state_dict()['model.{}.m.2.weight'.format(idx)].data[i, :, :, :] *= state_dict['model.{}.im.2.implicit'.format(idx2)].data[:, i, : :].squeeze()\n",
    "    model.state_dict()['model.{}.m.3.weight'.format(idx)].data[i, :, :, :] *= state_dict['model.{}.im.3.implicit'.format(idx2)].data[:, i, : :].squeeze()\n",
    "model.state_dict()['model.{}.m.0.bias'.format(idx)].data += state_dict['model.{}.m.0.weight'.format(idx2)].mul(state_dict['model.{}.ia.0.implicit'.format(idx2)]).sum(1).squeeze()\n",
    "model.state_dict()['model.{}.m.1.bias'.format(idx)].data += state_dict['model.{}.m.1.weight'.format(idx2)].mul(state_dict['model.{}.ia.1.implicit'.format(idx2)]).sum(1).squeeze()\n",
    "model.state_dict()['model.{}.m.2.bias'.format(idx)].data += state_dict['model.{}.m.2.weight'.format(idx2)].mul(state_dict['model.{}.ia.2.implicit'.format(idx2)]).sum(1).squeeze()\n",
|
||||
"model.state_dict()['model.{}.m.3.bias'.format(idx)].data += state_dict['model.{}.m.3.weight'.format(idx2)].mul(state_dict['model.{}.ia.3.implicit'.format(idx2)]).sum(1).squeeze()\n",
|
||||
"model.state_dict()['model.{}.m.0.bias'.format(idx)].data *= state_dict['model.{}.im.0.implicit'.format(idx2)].data.squeeze()\n",
|
||||
"model.state_dict()['model.{}.m.1.bias'.format(idx)].data *= state_dict['model.{}.im.1.implicit'.format(idx2)].data.squeeze()\n",
|
||||
"model.state_dict()['model.{}.m.2.bias'.format(idx)].data *= state_dict['model.{}.im.2.implicit'.format(idx2)].data.squeeze()\n",
|
||||
"model.state_dict()['model.{}.m.3.bias'.format(idx)].data *= state_dict['model.{}.im.3.implicit'.format(idx2)].data.squeeze()\n",
|
||||
"\n",
|
||||
"# model to be saved\n",
|
||||
"ckpt = {'model': deepcopy(model.module if is_parallel(model) else model).half(),\n",
|
||||
" 'optimizer': None,\n",
|
||||
" 'training_results': None,\n",
|
||||
" 'epoch': -1}\n",
|
||||
"\n",
|
||||
"# save reparameterized model\n",
|
||||
"torch.save(ckpt, 'cfg/deploy/yolov7-e6.pt')\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "a3bccf89",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## YOLOv7-D6 reparameterization"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "e5216b70",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# import\n",
|
||||
"from copy import deepcopy\n",
|
||||
"from models.yolo import Model\n",
|
||||
"import torch\n",
|
||||
"from utils.torch_utils import select_device, is_parallel\n",
|
||||
"\n",
|
||||
"device = select_device('0', batch_size=1)\n",
|
||||
"# model trained by cfg/training/*.yaml\n",
|
||||
"ckpt = torch.load('cfg/training/yolov7-d6.pt', map_location=device)\n",
|
||||
"# reparameterized model in cfg/deploy/*.yaml\n",
|
||||
"model = Model('cfg/deploy/yolov7-d6.yaml', ch=3, nc=80).to(device)\n",
|
||||
"\n",
|
||||
"# copy intersect weights\n",
|
||||
"state_dict = ckpt['model'].float().state_dict()\n",
|
||||
"exclude = []\n",
|
||||
"intersect_state_dict = {k: v for k, v in state_dict.items() if k in model.state_dict() and not any(x in k for x in exclude) and v.shape == model.state_dict()[k].shape}\n",
|
||||
"model.load_state_dict(intersect_state_dict, strict=False)\n",
|
||||
"model.names = ckpt['model'].names\n",
|
||||
"model.nc = ckpt['model'].nc\n",
|
||||
"\n",
|
||||
"idx = 162\n",
|
||||
"idx2 = 166\n",
|
||||
"\n",
|
||||
"# copy weights of lead head\n",
|
||||
"model.state_dict()['model.{}.m.0.weight'.format(idx)].data -= model.state_dict()['model.{}.m.0.weight'.format(idx)].data\n",
|
||||
"model.state_dict()['model.{}.m.1.weight'.format(idx)].data -= model.state_dict()['model.{}.m.1.weight'.format(idx)].data\n",
|
||||
"model.state_dict()['model.{}.m.2.weight'.format(idx)].data -= model.state_dict()['model.{}.m.2.weight'.format(idx)].data\n",
|
||||
"model.state_dict()['model.{}.m.3.weight'.format(idx)].data -= model.state_dict()['model.{}.m.3.weight'.format(idx)].data\n",
|
||||
"model.state_dict()['model.{}.m.0.weight'.format(idx)].data += state_dict['model.{}.m.0.weight'.format(idx2)].data\n",
|
||||
"model.state_dict()['model.{}.m.1.weight'.format(idx)].data += state_dict['model.{}.m.1.weight'.format(idx2)].data\n",
|
||||
"model.state_dict()['model.{}.m.2.weight'.format(idx)].data += state_dict['model.{}.m.2.weight'.format(idx2)].data\n",
|
||||
"model.state_dict()['model.{}.m.3.weight'.format(idx)].data += state_dict['model.{}.m.3.weight'.format(idx2)].data\n",
|
||||
"model.state_dict()['model.{}.m.0.bias'.format(idx)].data -= model.state_dict()['model.{}.m.0.bias'.format(idx)].data\n",
|
||||
"model.state_dict()['model.{}.m.1.bias'.format(idx)].data -= model.state_dict()['model.{}.m.1.bias'.format(idx)].data\n",
|
||||
"model.state_dict()['model.{}.m.2.bias'.format(idx)].data -= model.state_dict()['model.{}.m.2.bias'.format(idx)].data\n",
|
||||
"model.state_dict()['model.{}.m.3.bias'.format(idx)].data -= model.state_dict()['model.{}.m.3.bias'.format(idx)].data\n",
|
||||
"model.state_dict()['model.{}.m.0.bias'.format(idx)].data += state_dict['model.{}.m.0.bias'.format(idx2)].data\n",
|
||||
"model.state_dict()['model.{}.m.1.bias'.format(idx)].data += state_dict['model.{}.m.1.bias'.format(idx2)].data\n",
|
||||
"model.state_dict()['model.{}.m.2.bias'.format(idx)].data += state_dict['model.{}.m.2.bias'.format(idx2)].data\n",
|
||||
"model.state_dict()['model.{}.m.3.bias'.format(idx)].data += state_dict['model.{}.m.3.bias'.format(idx2)].data\n",
|
||||
"\n",
|
||||
"# reparametrized YOLOR\n",
|
||||
"for i in range(255):\n",
|
||||
" model.state_dict()['model.{}.m.0.weight'.format(idx)].data[i, :, :, :] *= state_dict['model.{}.im.0.implicit'.format(idx2)].data[:, i, : :].squeeze()\n",
|
||||
" model.state_dict()['model.{}.m.1.weight'.format(idx)].data[i, :, :, :] *= state_dict['model.{}.im.1.implicit'.format(idx2)].data[:, i, : :].squeeze()\n",
|
||||
" model.state_dict()['model.{}.m.2.weight'.format(idx)].data[i, :, :, :] *= state_dict['model.{}.im.2.implicit'.format(idx2)].data[:, i, : :].squeeze()\n",
|
||||
" model.state_dict()['model.{}.m.3.weight'.format(idx)].data[i, :, :, :] *= state_dict['model.{}.im.3.implicit'.format(idx2)].data[:, i, : :].squeeze()\n",
|
||||
"model.state_dict()['model.{}.m.0.bias'.format(idx)].data += state_dict['model.{}.m.0.weight'.format(idx2)].mul(state_dict['model.{}.ia.0.implicit'.format(idx2)]).sum(1).squeeze()\n",
|
||||
"model.state_dict()['model.{}.m.1.bias'.format(idx)].data += state_dict['model.{}.m.1.weight'.format(idx2)].mul(state_dict['model.{}.ia.1.implicit'.format(idx2)]).sum(1).squeeze()\n",
|
||||
"model.state_dict()['model.{}.m.2.bias'.format(idx)].data += state_dict['model.{}.m.2.weight'.format(idx2)].mul(state_dict['model.{}.ia.2.implicit'.format(idx2)]).sum(1).squeeze()\n",
|
||||
"model.state_dict()['model.{}.m.3.bias'.format(idx)].data += state_dict['model.{}.m.3.weight'.format(idx2)].mul(state_dict['model.{}.ia.3.implicit'.format(idx2)]).sum(1).squeeze()\n",
|
||||
"model.state_dict()['model.{}.m.0.bias'.format(idx)].data *= state_dict['model.{}.im.0.implicit'.format(idx2)].data.squeeze()\n",
|
||||
"model.state_dict()['model.{}.m.1.bias'.format(idx)].data *= state_dict['model.{}.im.1.implicit'.format(idx2)].data.squeeze()\n",
|
||||
"model.state_dict()['model.{}.m.2.bias'.format(idx)].data *= state_dict['model.{}.im.2.implicit'.format(idx2)].data.squeeze()\n",
|
||||
"model.state_dict()['model.{}.m.3.bias'.format(idx)].data *= state_dict['model.{}.im.3.implicit'.format(idx2)].data.squeeze()\n",
|
||||
"\n",
|
||||
"# model to be saved\n",
|
||||
"ckpt = {'model': deepcopy(model.module if is_parallel(model) else model).half(),\n",
|
||||
" 'optimizer': None,\n",
|
||||
" 'training_results': None,\n",
|
||||
" 'epoch': -1}\n",
|
||||
"\n",
|
||||
"# save reparameterized model\n",
|
||||
"torch.save(ckpt, 'cfg/deploy/yolov7-d6.pt')\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "334c273b",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## YOLOv7-E6E reparameterization"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "635fd8d2",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# import\n",
|
||||
"from copy import deepcopy\n",
|
||||
"from models.yolo import Model\n",
|
||||
"import torch\n",
|
||||
"from utils.torch_utils import select_device, is_parallel\n",
|
||||
"\n",
|
||||
"device = select_device('0', batch_size=1)\n",
|
||||
"# model trained by cfg/training/*.yaml\n",
|
||||
"ckpt = torch.load('cfg/training/yolov7-e6e.pt', map_location=device)\n",
|
||||
"# reparameterized model in cfg/deploy/*.yaml\n",
|
||||
"model = Model('cfg/deploy/yolov7-e6e.yaml', ch=3, nc=80).to(device)\n",
|
||||
"\n",
|
||||
"# copy intersect weights\n",
|
||||
"state_dict = ckpt['model'].float().state_dict()\n",
|
||||
"exclude = []\n",
|
||||
"intersect_state_dict = {k: v for k, v in state_dict.items() if k in model.state_dict() and not any(x in k for x in exclude) and v.shape == model.state_dict()[k].shape}\n",
|
||||
"model.load_state_dict(intersect_state_dict, strict=False)\n",
|
||||
"model.names = ckpt['model'].names\n",
|
||||
"model.nc = ckpt['model'].nc\n",
|
||||
"\n",
|
||||
"idx = 261\n",
|
||||
"idx2 = 265\n",
|
||||
"\n",
|
||||
"# copy weights of lead head\n",
|
||||
"model.state_dict()['model.{}.m.0.weight'.format(idx)].data -= model.state_dict()['model.{}.m.0.weight'.format(idx)].data\n",
|
||||
"model.state_dict()['model.{}.m.1.weight'.format(idx)].data -= model.state_dict()['model.{}.m.1.weight'.format(idx)].data\n",
|
||||
"model.state_dict()['model.{}.m.2.weight'.format(idx)].data -= model.state_dict()['model.{}.m.2.weight'.format(idx)].data\n",
|
||||
"model.state_dict()['model.{}.m.3.weight'.format(idx)].data -= model.state_dict()['model.{}.m.3.weight'.format(idx)].data\n",
|
||||
"model.state_dict()['model.{}.m.0.weight'.format(idx)].data += state_dict['model.{}.m.0.weight'.format(idx2)].data\n",
|
||||
"model.state_dict()['model.{}.m.1.weight'.format(idx)].data += state_dict['model.{}.m.1.weight'.format(idx2)].data\n",
|
||||
"model.state_dict()['model.{}.m.2.weight'.format(idx)].data += state_dict['model.{}.m.2.weight'.format(idx2)].data\n",
|
||||
"model.state_dict()['model.{}.m.3.weight'.format(idx)].data += state_dict['model.{}.m.3.weight'.format(idx2)].data\n",
|
||||
"model.state_dict()['model.{}.m.0.bias'.format(idx)].data -= model.state_dict()['model.{}.m.0.bias'.format(idx)].data\n",
|
||||
"model.state_dict()['model.{}.m.1.bias'.format(idx)].data -= model.state_dict()['model.{}.m.1.bias'.format(idx)].data\n",
|
||||
"model.state_dict()['model.{}.m.2.bias'.format(idx)].data -= model.state_dict()['model.{}.m.2.bias'.format(idx)].data\n",
|
||||
"model.state_dict()['model.{}.m.3.bias'.format(idx)].data -= model.state_dict()['model.{}.m.3.bias'.format(idx)].data\n",
|
||||
"model.state_dict()['model.{}.m.0.bias'.format(idx)].data += state_dict['model.{}.m.0.bias'.format(idx2)].data\n",
|
||||
"model.state_dict()['model.{}.m.1.bias'.format(idx)].data += state_dict['model.{}.m.1.bias'.format(idx2)].data\n",
|
||||
"model.state_dict()['model.{}.m.2.bias'.format(idx)].data += state_dict['model.{}.m.2.bias'.format(idx2)].data\n",
|
||||
"model.state_dict()['model.{}.m.3.bias'.format(idx)].data += state_dict['model.{}.m.3.bias'.format(idx2)].data\n",
|
||||
"\n",
|
||||
"# reparametrized YOLOR\n",
|
||||
"for i in range(255):\n",
|
||||
" model.state_dict()['model.{}.m.0.weight'.format(idx)].data[i, :, :, :] *= state_dict['model.{}.im.0.implicit'.format(idx2)].data[:, i, : :].squeeze()\n",
|
||||
" model.state_dict()['model.{}.m.1.weight'.format(idx)].data[i, :, :, :] *= state_dict['model.{}.im.1.implicit'.format(idx2)].data[:, i, : :].squeeze()\n",
|
||||
" model.state_dict()['model.{}.m.2.weight'.format(idx)].data[i, :, :, :] *= state_dict['model.{}.im.2.implicit'.format(idx2)].data[:, i, : :].squeeze()\n",
|
||||
" model.state_dict()['model.{}.m.3.weight'.format(idx)].data[i, :, :, :] *= state_dict['model.{}.im.3.implicit'.format(idx2)].data[:, i, : :].squeeze()\n",
|
||||
"model.state_dict()['model.{}.m.0.bias'.format(idx)].data += state_dict['model.{}.m.0.weight'.format(idx2)].mul(state_dict['model.{}.ia.0.implicit'.format(idx2)]).sum(1).squeeze()\n",
|
||||
"model.state_dict()['model.{}.m.1.bias'.format(idx)].data += state_dict['model.{}.m.1.weight'.format(idx2)].mul(state_dict['model.{}.ia.1.implicit'.format(idx2)]).sum(1).squeeze()\n",
|
||||
"model.state_dict()['model.{}.m.2.bias'.format(idx)].data += state_dict['model.{}.m.2.weight'.format(idx2)].mul(state_dict['model.{}.ia.2.implicit'.format(idx2)]).sum(1).squeeze()\n",
|
||||
"model.state_dict()['model.{}.m.3.bias'.format(idx)].data += state_dict['model.{}.m.3.weight'.format(idx2)].mul(state_dict['model.{}.ia.3.implicit'.format(idx2)]).sum(1).squeeze()\n",
|
||||
"model.state_dict()['model.{}.m.0.bias'.format(idx)].data *= state_dict['model.{}.im.0.implicit'.format(idx2)].data.squeeze()\n",
|
||||
"model.state_dict()['model.{}.m.1.bias'.format(idx)].data *= state_dict['model.{}.im.1.implicit'.format(idx2)].data.squeeze()\n",
|
||||
"model.state_dict()['model.{}.m.2.bias'.format(idx)].data *= state_dict['model.{}.im.2.implicit'.format(idx2)].data.squeeze()\n",
|
||||
"model.state_dict()['model.{}.m.3.bias'.format(idx)].data *= state_dict['model.{}.im.3.implicit'.format(idx2)].data.squeeze()\n",
|
||||
"\n",
|
||||
"# model to be saved\n",
|
||||
"ckpt = {'model': deepcopy(model.module if is_parallel(model) else model).half(),\n",
|
||||
" 'optimizer': None,\n",
|
||||
" 'training_results': None,\n",
|
||||
" 'epoch': -1}\n",
|
||||
"\n",
|
||||
"# save reparameterized model\n",
|
||||
"torch.save(ckpt, 'cfg/deploy/yolov7-e6e.pt')\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "63a62625",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.8.10"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
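Editor's note: a quick way to sanity-check any of the reparameterized checkpoints above is to compare the trained and deploy models on the same random input. This is a minimal sketch, not part of the repository; it assumes both checkpoints exist and that the repo's modules are on sys.path (torch.load needs them to unpickle the serialized model).

# sanity check for a reparameterized checkpoint (sketch; paths and the
# 640x640 input size are assumptions, adjust to the model you exported)
import torch

trained = torch.load('cfg/training/yolov7-w6.pt', map_location='cpu')['model'].float().eval()
deploy = torch.load('cfg/deploy/yolov7-w6.pt', map_location='cpu')['model'].float().eval()

x = torch.rand(1, 3, 640, 640)
with torch.no_grad():
    y_trained = trained(x)[0]  # inference output of the lead head
    y_deploy = deploy(x)[0]

# the two predictions should agree up to the precision lost in the .half() round-trip
print((y_trained - y_deploy).abs().max())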
32
yolov7-tracker-example/tracker/config_files/mot17.yaml
Normal file
32
yolov7-tracker-example/tracker/config_files/mot17.yaml
Normal file
@ -0,0 +1,32 @@
# Config file of MOT17 dataset

DATASET_ROOT: '/data/wujiapeng/datasets/MOT17'  # your dataset root
SPLIT: test
CATEGORY_NAMES:  # category names to show
  - 'pedestrian'

CATEGORY_DICT:
  0: 'pedestrian'

CERTAIN_SEQS:
  -
IGNORE_SEQS:  # Seqs you want to ignore
  -

YAML_DICT: ''  # NOTE: ONLY for the yolo v5 model loader (func DetectMultiBackend)

TRACK_EVAL:  # If you use TrackEval to evaluate, use these configs
  'DISPLAY_LESS_PROGRESS': False
  'GT_FOLDER': '/data/wujiapeng/datasets/MOT17/train'
  'TRACKERS_FOLDER': './tracker/results'
  'SKIP_SPLIT_FOL': True
  'TRACKER_SUB_FOLDER': ''
  'SEQ_INFO':
    'MOT17-02-SDP': null
    'MOT17-04-SDP': null
    'MOT17-05-SDP': null
    'MOT17-09-SDP': null
    'MOT17-10-SDP': null
    'MOT17-11-SDP': null
    'MOT17-13-SDP': null
  'GT_LOC_FORMAT': '{gt_folder}/{seq}/gt/gt.txt'
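These config files are consumed by tracker/track.py below: the file matching the --dataset argument is read with PyYAML into a plain dict. A minimal sketch of that loading step (it mirrors the exact call used in track.py):

import yaml

with open('./tracker/config_files/mot17.yaml', 'r') as f:
    cfgs = yaml.load(f, Loader=yaml.FullLoader)

print(cfgs['DATASET_ROOT'])              # '/data/wujiapeng/datasets/MOT17'
print(cfgs['CATEGORY_DICT'][0])          # 'pedestrian'
print(cfgs['TRACK_EVAL']['GT_FOLDER'])   # ground-truth folder handed to TrackEval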
26
yolov7-tracker-example/tracker/config_files/uavdt.yaml
Normal file
26
yolov7-tracker-example/tracker/config_files/uavdt.yaml
Normal file
@ -0,0 +1,26 @@
# Config file of UAVDT dataset

DATASET_ROOT: '/data/wujiapeng/datasets/UAVDT'  # your dataset root
SPLIT: test
CATEGORY_NAMES:  # category names to show
  - 'car'

CATEGORY_DICT:
  0: 'car'

CERTAIN_SEQS:
  -
IGNORE_SEQS:  # Seqs you want to ignore
  -

YAML_DICT: './data/UAVDT.yaml'  # NOTE: ONLY for the yolo v5 model loader (func DetectMultiBackend)

TRACK_EVAL:  # If you use TrackEval to evaluate, use these configs
  'DISPLAY_LESS_PROGRESS': False
  'GT_FOLDER': '/data/wujiapeng/datasets/UAVDT/UAV-benchmark-M'
  'TRACKERS_FOLDER': './tracker/results'
  'SKIP_SPLIT_FOL': True
  'TRACKER_SUB_FOLDER': ''
  'SEQ_INFO':
    'M0101': 407
  'GT_LOC_FORMAT': '{gt_folder}/{seq}/gt/gt.txt'
61
yolov7-tracker-example/tracker/config_files/visdrone.yaml
Normal file
61
yolov7-tracker-example/tracker/config_files/visdrone.yaml
Normal file
@ -0,0 +1,61 @@
# Config file of VisDrone dataset

DATASET_ROOT: '/data/wujiapeng/datasets/VisDrone2019/VisDrone2019'
SPLIT: test
CATEGORY_NAMES:
  - 'pedestrian'
  - 'people'
  - 'bicycle'
  - 'car'
  - 'van'
  - 'truck'
  - 'tricycle'
  - 'awning-tricycle'
  - 'bus'
  - 'motor'

CATEGORY_DICT:
  0: 'pedestrian'
  1: 'people'
  2: 'bicycle'
  3: 'car'
  4: 'van'
  5: 'truck'
  6: 'tricycle'
  7: 'awning-tricycle'
  8: 'bus'
  9: 'motor'

CERTAIN_SEQS:
  -

IGNORE_SEQS:  # Seqs you want to ignore
  -

YAML_DICT: './data/Visdrone_all.yaml'  # NOTE: ONLY for the yolo v5 model loader (func DetectMultiBackend)

TRACK_EVAL:  # If you use TrackEval to evaluate, use these configs
  'DISPLAY_LESS_PROGRESS': False
  'GT_FOLDER': '/data/wujiapeng/datasets/VisDrone2019/VisDrone2019/VisDrone2019-MOT-test-dev/annotations'
  'TRACKERS_FOLDER': './tracker/results'
  'SKIP_SPLIT_FOL': True
  'TRACKER_SUB_FOLDER': ''
  'SEQ_INFO':
    'uav0000009_03358_v': 219
    'uav0000073_00600_v': 328
    'uav0000073_04464_v': 312
    'uav0000077_00720_v': 780
    'uav0000088_00290_v': 296
    'uav0000119_02301_v': 179
    'uav0000120_04775_v': 1000
    'uav0000161_00000_v': 308
    'uav0000188_00000_v': 260
    'uav0000201_00000_v': 677
    'uav0000249_00001_v': 360
    'uav0000249_02688_v': 244
    'uav0000297_00000_v': 146
    'uav0000297_02761_v': 373
    'uav0000306_00230_v': 420
    'uav0000355_00001_v': 468
    'uav0000370_00001_v': 265
  'GT_LOC_FORMAT': '{gt_folder}/{seq}.txt'
@ -0,0 +1,51 @@
# Config file of VisDrone dataset

DATASET_ROOT: '/data/wujiapeng/datasets/VisDrone2019/VisDrone2019'
SPLIT: test
CATEGORY_NAMES:
  - 'pedestrian'
  - 'car'
  - 'van'
  - 'truck'
  - 'bus'

CATEGORY_DICT:
  0: 'pedestrian'
  1: 'car'
  2: 'van'
  3: 'truck'
  4: 'bus'

CERTAIN_SEQS:
  -

IGNORE_SEQS:  # Seqs you want to ignore
  -

YAML_DICT: './data/Visdrone_all.yaml'  # NOTE: ONLY for the yolo v5 model loader (func DetectMultiBackend)

TRACK_EVAL:  # If you use TrackEval to evaluate, use these configs
  'DISPLAY_LESS_PROGRESS': False
  'GT_FOLDER': '/data/wujiapeng/datasets/VisDrone2019/VisDrone2019/VisDrone2019-MOT-test-dev/annotations'
  'TRACKERS_FOLDER': './tracker/results'
  'SKIP_SPLIT_FOL': True
  'TRACKER_SUB_FOLDER': ''
  'SEQ_INFO':
    'uav0000009_03358_v': 219
    'uav0000073_00600_v': 328
    'uav0000073_04464_v': 312
    'uav0000077_00720_v': 780
    'uav0000088_00290_v': 296
    'uav0000119_02301_v': 179
    'uav0000120_04775_v': 1000
    'uav0000161_00000_v': 308
    'uav0000188_00000_v': 260
    'uav0000201_00000_v': 677
    'uav0000249_00001_v': 360
    'uav0000249_02688_v': 244
    'uav0000297_00000_v': 146
    'uav0000297_02761_v': 373
    'uav0000306_00230_v': 420
    'uav0000355_00001_v': 468
    'uav0000370_00001_v': 265
  'GT_LOC_FORMAT': '{gt_folder}/{seq}.txt'
37
yolov7-tracker-example/tracker/my_timer.py
Normal file
37
yolov7-tracker-example/tracker/my_timer.py
Normal file
@ -0,0 +1,37 @@
import time


class Timer(object):
    """A simple timer."""
    def __init__(self):
        self.total_time = 0.
        self.calls = 0
        self.start_time = 0.
        self.diff = 0.
        self.average_time = 0.

        self.duration = 0.

    def tic(self):
        # using time.time instead of time.clock because time.clock
        # does not normalize for multithreading
        self.start_time = time.time()

    def toc(self, average=True):
        self.diff = time.time() - self.start_time
        self.total_time += self.diff
        self.calls += 1
        self.average_time = self.total_time / self.calls
        if average:
            self.duration = self.average_time
        else:
            self.duration = self.diff
        return self.duration

    def clear(self):
        self.total_time = 0.
        self.calls = 0
        self.start_time = 0.
        self.diff = 0.
        self.average_time = 0.
        self.duration = 0.
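A short usage sketch for Timer (the numbers are made up; the tic/toc pattern matches how track.py below times each frame):

import time
from my_timer import Timer

timer = Timer()
for _ in range(3):
    timer.tic()        # start timing one iteration
    time.sleep(0.01)   # stand-in for detector + tracker work
    timer.toc()        # accumulates total_time and updates the running average
print(timer.calls)         # 3
print(timer.average_time)  # roughly 0.01 seconds per iteration
timer.clear()              # reset all counters, e.g. between sequences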
305
yolov7-tracker-example/tracker/track.py
Normal file
305
yolov7-tracker-example/tracker/track.py
Normal file
@ -0,0 +1,305 @@
"""
main code for tracking
"""
import sys, os
import numpy as np
import torch
import cv2
from PIL import Image
from tqdm import tqdm
import yaml

from loguru import logger
import argparse

from tracking_utils.envs import select_device
from tracking_utils.tools import *
from tracking_utils.visualization import plot_img, save_video
from my_timer import Timer

from tracker_dataloader import TestDataset

# trackers
from trackers.byte_tracker import ByteTracker
from trackers.sort_tracker import SortTracker
from trackers.botsort_tracker import BotTracker
from trackers.c_biou_tracker import C_BIoUTracker
from trackers.ocsort_tracker import OCSortTracker
from trackers.deepsort_tracker import DeepSortTracker
from trackers.strongsort_tracker import StrongSortTracker
from trackers.sparse_tracker import SparseTracker

# YOLOX modules
try:
    from yolox.exp import get_exp
    from yolox_utils.postprocess import postprocess_yolox
    from yolox.utils import fuse_model
except Exception as e:
    logger.warning(e)
    logger.warning('Failed to load YOLOX. If you want to use YOLOX, please check the installation.')
    pass

# YOLOv7 modules
try:
    sys.path.append(os.getcwd())
    from models.experimental import attempt_load
    from utils.torch_utils import select_device, time_synchronized, TracedModel
    from utils.general import non_max_suppression, scale_coords, check_img_size
    from yolov7_utils.postprocess import postprocess as postprocess_yolov7

except Exception as e:
    logger.warning(e)
    logger.warning('Failed to load YOLOv7. If you want to use YOLOv7, please check the installation.')
    pass

# YOLOv8 modules
try:
    from ultralytics import YOLO
    from yolov8_utils.postprocess import postprocess as postprocess_yolov8

except Exception as e:
    logger.warning(e)
    logger.warning('Failed to load YOLOv8. If you want to use YOLOv8, please check the installation.')
    pass

TRACKER_DICT = {
    'sort': SortTracker,
    'bytetrack': ByteTracker,
    'botsort': BotTracker,
    'c_bioutrack': C_BIoUTracker,
    'ocsort': OCSortTracker,
    'deepsort': DeepSortTracker,
    'strongsort': StrongSortTracker,
    'sparsetrack': SparseTracker
}

def get_args():

    parser = argparse.ArgumentParser()

    """general"""
    parser.add_argument('--dataset', type=str, default='visdrone_part', help='visdrone, mot17, etc.')
    parser.add_argument('--detector', type=str, default='yolov8', help='yolov7, yolox, etc.')
    parser.add_argument('--tracker', type=str, default='sort', help='sort, deepsort, etc.')
    parser.add_argument('--reid_model', type=str, default='osnet_x0_25', help='osnet or deepsort')

    parser.add_argument('--kalman_format', type=str, default='default', help='which kind of Kalman filter to use: sort, deepsort, byte, etc.')
    parser.add_argument('--img_size', type=int, default=1280, help='image size, [h, w]')

    parser.add_argument('--conf_thresh', type=float, default=0.2, help='confidence threshold to filter detections')
    parser.add_argument('--nms_thresh', type=float, default=0.7, help='threshold for NMS')
    parser.add_argument('--iou_thresh', type=float, default=0.5, help='IoU threshold to filter tracks')

    parser.add_argument('--device', type=str, default='6', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')

    """yolox"""
    parser.add_argument('--yolox_exp_file', type=str, default='./tracker/yolox_utils/yolox_m.py')

    """model path"""
    parser.add_argument('--detector_model_path', type=str, default='./weights/best.pt', help='model path')
    parser.add_argument('--trace', type=bool, default=False, help='traced model of YOLOv7')
    # other model paths
    parser.add_argument('--reid_model_path', type=str, default='./weights/osnet_x0_25.pth', help='path of the reid model')
    parser.add_argument('--dhn_path', type=str, default='./weights/DHN.pth', help='path of the DHN model for DeepMOT')

    """other options"""
    parser.add_argument('--discard_reid', action='store_true', help='discard the reid model; only works for trackers such as BoT-SORT that have a reid part')
    parser.add_argument('--track_buffer', type=int, default=30, help='tracking buffer')
    parser.add_argument('--gamma', type=float, default=0.1, help='param to control the fusion of motion and appearance distances')
    parser.add_argument('--min_area', type=float, default=150, help='used to filter small bboxes')

    parser.add_argument('--save_dir', type=str, default='track_results/{dataset_name}/{split}')
    parser.add_argument('--save_images', action='store_true', help='save tracking results (images)')
    parser.add_argument('--save_videos', action='store_true', help='save tracking results (video)')

    parser.add_argument('--track_eval', type=bool, default=True, help='use TrackEval to evaluate')

    return parser.parse_args()

def main(args, dataset_cfgs):

    """1. set some params"""

    # NOTE: if you save videos, you must save images as well
    if args.save_videos:
        args.save_images = True

    """2. load detector"""
    device = select_device(args.device)

    if args.detector == 'yolox':

        exp = get_exp(args.yolox_exp_file, None)  # TODO: modify num_classes etc. for specific dataset
        model_img_size = exp.input_size
        model = exp.get_model()
        model.to(device)
        model.eval()

        logger.info(f"loading detector {args.detector} checkpoint {args.detector_model_path}")
        ckpt = torch.load(args.detector_model_path, map_location=device)
        model.load_state_dict(ckpt['model'])
        logger.info("loaded checkpoint done")
        model = fuse_model(model)

        stride = None  # match with yolo v7

        logger.info(f'Now detector is on device {next(model.parameters()).device}')

    elif args.detector == 'yolov7':

        logger.info(f"loading detector {args.detector} checkpoint {args.detector_model_path}")
        model = attempt_load(args.detector_model_path, map_location=device)

        # get inference img size
        stride = int(model.stride.max())  # model stride
        model_img_size = check_img_size(args.img_size, s=stride)  # check img_size

        # Traced model
        model = TracedModel(model, device=device, img_size=args.img_size)
        # model.half()

        logger.info("loaded checkpoint done")

        logger.info(f'Now detector is on device {next(model.parameters()).device}')

    elif args.detector == 'yolov8':

        logger.info(f"loading detector {args.detector} checkpoint {args.detector_model_path}")
        model = YOLO(args.detector_model_path)

        model_img_size = [None, None]
        stride = None

        logger.info("loaded checkpoint done")

    else:
        logger.error(f"detector {args.detector} is not supported")
        exit(0)

    """3. load sequences"""
    DATA_ROOT = dataset_cfgs['DATASET_ROOT']
    SPLIT = dataset_cfgs['SPLIT']

    seqs = sorted(os.listdir(os.path.join(DATA_ROOT, 'images', SPLIT)))
    seqs = [seq for seq in seqs if seq not in dataset_cfgs['IGNORE_SEQS']]
    if None not in dataset_cfgs['CERTAIN_SEQS']:
        seqs = dataset_cfgs['CERTAIN_SEQS']

    logger.info(f'Total {len(seqs)} seqs will be tracked: {seqs}')

    save_dir = args.save_dir.format(dataset_name=args.dataset, split=SPLIT)

    """4. tracking"""

    # set timer
    timer = Timer()
    seq_fps = []

    for seq in seqs:
        logger.info(f'--------------tracking seq {seq}--------------')

        dataset = TestDataset(DATA_ROOT, SPLIT, seq_name=seq, img_size=model_img_size, model=args.detector, stride=stride)

        data_loader = torch.utils.data.DataLoader(dataset, batch_size=1, shuffle=False)

        tracker = TRACKER_DICT[args.tracker](args, )

        process_bar = enumerate(data_loader)
        process_bar = tqdm(process_bar, total=len(data_loader), ncols=150)

        results = []

        for frame_idx, (ori_img, img) in process_bar:

            # start timing this frame
            timer.tic()

            if args.detector == 'yolov8':
                img = img.squeeze(0).cpu().numpy()

            else:
                img = img.to(device)  # (1, C, H, W)
                img = img.float()

            ori_img = ori_img.squeeze(0)

            # get detector output
            with torch.no_grad():
                if args.detector == 'yolov8':
                    output = model.predict(img, conf=args.conf_thresh, iou=args.nms_thresh)
                else:
                    output = model(img)

            # postprocess output to original scales
            if args.detector == 'yolox':
                output = postprocess_yolox(output, len(dataset_cfgs['CATEGORY_NAMES']), conf_thresh=args.conf_thresh,
                                           img=img, ori_img=ori_img)

            elif args.detector == 'yolov7':
                output = postprocess_yolov7(output, args.conf_thresh, args.nms_thresh, img.shape[2:], ori_img.shape)

            elif args.detector == 'yolov8':
                output = postprocess_yolov8(output)

            else: raise NotImplementedError

            # output: (tlbr, conf, cls)
            # convert tlbr to tlwh
            if isinstance(output, torch.Tensor):
                output = output.detach().cpu().numpy()
            output[:, 2] -= output[:, 0]
            output[:, 3] -= output[:, 1]
            current_tracks = tracker.update(output, img, ori_img.cpu().numpy())

            # save results
            cur_tlwh, cur_id, cur_cls, cur_score = [], [], [], []
            for trk in current_tracks:
                bbox = trk.tlwh
                id = trk.track_id
                cls = trk.category
                score = trk.score

                # filter bboxes with low area
                if bbox[2] * bbox[3] > args.min_area:
                    cur_tlwh.append(bbox)
                    cur_id.append(id)
                    cur_cls.append(cls)
                    cur_score.append(score)
                    # results.append((frame_id + 1, id, bbox, cls))

            results.append((frame_idx + 1, cur_id, cur_tlwh, cur_cls, cur_score))

            timer.toc()

            if args.save_images:
                plot_img(img=ori_img, frame_id=frame_idx, results=[cur_tlwh, cur_id, cur_cls],
                         save_dir=os.path.join(save_dir, 'vis_results'))

        save_results(folder_name=os.path.join(args.dataset, SPLIT),
                     seq_name=seq,
                     results=results)

        # show the fps
        seq_fps.append(frame_idx / timer.total_time)
        logger.info(f'fps of seq {seq}: {seq_fps[-1]}')
        timer.clear()

        if args.save_videos:
            save_video(images_path=os.path.join(save_dir, 'vis_results'))
            logger.info(f'save video of {seq} done')

    # show the average fps
    logger.info(f'average fps: {np.mean(seq_fps)}')


if __name__ == '__main__':

    args = get_args()

    with open(f'./tracker/config_files/{args.dataset}.yaml', 'r') as f:
        cfgs = yaml.load(f, Loader=yaml.FullLoader)

    main(args, cfgs)
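One detail of the loop above deserves a worked example: every postprocess function hands back rows of (x1, y1, x2, y2, conf, cls), and the two in-place subtractions convert the corner format to (x, y, w, h) before tracker.update. With made-up numbers:

import numpy as np

output = np.array([[100., 50., 180., 210., 0.9, 0.]])  # one detection: tlbr + conf + cls
output[:, 2] -= output[:, 0]  # w = x2 - x1 -> 80
output[:, 3] -= output[:, 1]  # h = y2 - y1 -> 160
# output is now [[100., 50., 80., 160., 0.9, 0.]], i.e. tlwh + conf + cls;
# an area filter like w * h > min_area keeps this box (80 * 160 = 12800 > 150)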
266
yolov7-tracker-example/tracker/track_demo.py
Normal file
266
yolov7-tracker-example/tracker/track_demo.py
Normal file
@ -0,0 +1,266 @@
"""
main code for the tracking demo
"""
import sys, os
import numpy as np
import torch
import cv2
from PIL import Image
from tqdm import tqdm
import yaml

from loguru import logger
import argparse

from tracking_utils.envs import select_device
from tracking_utils.tools import *
from tracking_utils.visualization import plot_img, save_video

from tracker_dataloader import TestDataset, DemoDataset

# trackers
from trackers.byte_tracker import ByteTracker
from trackers.sort_tracker import SortTracker
from trackers.botsort_tracker import BotTracker
from trackers.c_biou_tracker import C_BIoUTracker
from trackers.ocsort_tracker import OCSortTracker
from trackers.deepsort_tracker import DeepSortTracker

# YOLOX modules
try:
    from yolox.exp import get_exp
    from yolox_utils.postprocess import postprocess_yolox
    from yolox.utils import fuse_model
except Exception as e:
    logger.warning(e)
    logger.warning('Failed to load YOLOX. If you want to use YOLOX, please check the installation.')
    pass

# YOLOv7 modules
try:
    sys.path.append(os.getcwd())
    from models.experimental import attempt_load
    from utils.torch_utils import select_device, time_synchronized, TracedModel
    from utils.general import non_max_suppression, scale_coords, check_img_size
    from yolov7_utils.postprocess import postprocess as postprocess_yolov7

except Exception as e:
    logger.warning(e)
    logger.warning('Failed to load YOLOv7. If you want to use YOLOv7, please check the installation.')
    pass

# YOLOv8 modules
try:
    from ultralytics import YOLO
    from yolov8_utils.postprocess import postprocess as postprocess_yolov8

except Exception as e:
    logger.warning(e)
    logger.warning('Failed to load YOLOv8. If you want to use YOLOv8, please check the installation.')
    pass

TRACKER_DICT = {
    'sort': SortTracker,
    'bytetrack': ByteTracker,
    'botsort': BotTracker,
    'c_bioutrack': C_BIoUTracker,
    'ocsort': OCSortTracker,
    'deepsort': DeepSortTracker
}

def get_args():

    parser = argparse.ArgumentParser()

    """general"""
    parser.add_argument('--obj', type=str, required=True, default='demo.mp4', help='video or image folder PATH')

    parser.add_argument('--detector', type=str, default='yolov8', help='yolov7, yolox, etc.')
    parser.add_argument('--tracker', type=str, default='sort', help='sort, deepsort, etc.')
    parser.add_argument('--reid_model', type=str, default='osnet_x0_25', help='osnet or deepsort')

    parser.add_argument('--kalman_format', type=str, default='default', help='which kind of Kalman filter to use: sort, deepsort, byte, etc.')
    parser.add_argument('--img_size', type=int, default=1280, help='image size, [h, w]')

    parser.add_argument('--conf_thresh', type=float, default=0.2, help='confidence threshold to filter detections')
    parser.add_argument('--nms_thresh', type=float, default=0.7, help='threshold for NMS')
    parser.add_argument('--iou_thresh', type=float, default=0.5, help='IoU threshold to filter tracks')

    parser.add_argument('--device', type=str, default='6', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')

    """yolox"""
    parser.add_argument('--num_classes', type=int, default=1)
    parser.add_argument('--yolox_exp_file', type=str, default='./tracker/yolox_utils/yolox_m.py')

    """model path"""
    parser.add_argument('--detector_model_path', type=str, default='./weights/best.pt', help='model path')
    parser.add_argument('--trace', type=bool, default=False, help='traced model of YOLOv7')
    # other model paths
    parser.add_argument('--reid_model_path', type=str, default='./weights/osnet_x0_25.pth', help='path of the reid model')
    parser.add_argument('--dhn_path', type=str, default='./weights/DHN.pth', help='path of the DHN model for DeepMOT')

    """other options"""
    parser.add_argument('--discard_reid', action='store_true', help='discard the reid model; only works for trackers such as BoT-SORT that have a reid part')
    parser.add_argument('--track_buffer', type=int, default=30, help='tracking buffer')
    parser.add_argument('--gamma', type=float, default=0.1, help='param to control the fusion of motion and appearance distances')
    parser.add_argument('--min_area', type=float, default=150, help='used to filter small bboxes')

    parser.add_argument('--save_dir', type=str, default='track_demo_results')
    parser.add_argument('--save_images', action='store_true', help='save tracking results (images)')
    parser.add_argument('--save_videos', action='store_true', help='save tracking results (video)')

    parser.add_argument('--track_eval', type=bool, default=True, help='use TrackEval to evaluate')

    return parser.parse_args()

def main(args):

    """1. set some params"""

    # NOTE: if you save videos, you must save images as well
    if args.save_videos:
        args.save_images = True

    """2. load detector"""
    device = select_device(args.device)

    if args.detector == 'yolox':

        exp = get_exp(args.yolox_exp_file, None)  # TODO: modify num_classes etc. for specific dataset
        model_img_size = exp.input_size
        model = exp.get_model()
        model.to(device)
        model.eval()

        logger.info(f"loading detector {args.detector} checkpoint {args.detector_model_path}")
        ckpt = torch.load(args.detector_model_path, map_location=device)
        model.load_state_dict(ckpt['model'])
        logger.info("loaded checkpoint done")
        model = fuse_model(model)

        stride = None  # match with yolo v7

        logger.info(f'Now detector is on device {next(model.parameters()).device}')

    elif args.detector == 'yolov7':

        logger.info(f"loading detector {args.detector} checkpoint {args.detector_model_path}")
        model = attempt_load(args.detector_model_path, map_location=device)

        # get inference img size
        stride = int(model.stride.max())  # model stride
        model_img_size = check_img_size(args.img_size, s=stride)  # check img_size

        # Traced model
        model = TracedModel(model, device=device, img_size=args.img_size)
        # model.half()

        logger.info("loaded checkpoint done")

        logger.info(f'Now detector is on device {next(model.parameters()).device}')

    elif args.detector == 'yolov8':

        logger.info(f"loading detector {args.detector} checkpoint {args.detector_model_path}")
        model = YOLO(args.detector_model_path)

        model_img_size = [None, None]
        stride = None

        logger.info("loaded checkpoint done")

    else:
        logger.error(f"detector {args.detector} is not supported")
        exit(0)

    """3. load sequences"""

    dataset = DemoDataset(file_name=args.obj, img_size=model_img_size, model=args.detector, stride=stride, )
    data_loader = torch.utils.data.DataLoader(dataset, batch_size=1, shuffle=False)

    tracker = TRACKER_DICT[args.tracker](args, )

    save_dir = args.save_dir

    process_bar = enumerate(data_loader)
    process_bar = tqdm(process_bar, total=len(data_loader), ncols=150)

    results = []

    """4. tracking"""

    for frame_idx, (ori_img, img) in process_bar:
        if args.detector == 'yolov8':
            img = img.squeeze(0).cpu().numpy()

        else:
            img = img.to(device)  # (1, C, H, W)
            img = img.float()

        ori_img = ori_img.squeeze(0)

        # get detector output
        with torch.no_grad():
            if args.detector == 'yolov8':
                output = model.predict(img, conf=args.conf_thresh, iou=args.nms_thresh)
            else:
                output = model(img)

        # postprocess output to original scales
        if args.detector == 'yolox':
            output = postprocess_yolox(output, args.num_classes, conf_thresh=args.conf_thresh,
                                       img=img, ori_img=ori_img)

        elif args.detector == 'yolov7':
            output = postprocess_yolov7(output, args.conf_thresh, args.nms_thresh, img.shape[2:], ori_img.shape)

        elif args.detector == 'yolov8':
            output = postprocess_yolov8(output)

        else: raise NotImplementedError

        # output: (tlbr, conf, cls)
        # convert tlbr to tlwh
        if isinstance(output, torch.Tensor):
            output = output.detach().cpu().numpy()
        output[:, 2] -= output[:, 0]
        output[:, 3] -= output[:, 1]
        current_tracks = tracker.update(output, img, ori_img.cpu().numpy())

        # save results
        cur_tlwh, cur_id, cur_cls, cur_score = [], [], [], []
        for trk in current_tracks:
            bbox = trk.tlwh
            id = trk.track_id
            cls = trk.category
            score = trk.score

            # filter bboxes with low area
            if bbox[2] * bbox[3] > args.min_area:
                cur_tlwh.append(bbox)
                cur_id.append(id)
                cur_cls.append(cls)
                cur_score.append(score)
                # results.append((frame_id + 1, id, bbox, cls))

        results.append((frame_idx + 1, cur_id, cur_tlwh, cur_cls, cur_score))

        if args.save_images:
            plot_img(img=ori_img, frame_id=frame_idx, results=[cur_tlwh, cur_id, cur_cls],
                     save_dir=os.path.join(save_dir, 'vis_results'))

    save_results(folder_name=os.path.join(save_dir, 'txt_results'),
                 seq_name='demo',
                 results=results)

    if args.save_videos:
        save_video(images_path=os.path.join(save_dir, 'vis_results'))
        logger.info('save video done')


if __name__ == '__main__':

    args = get_args()

    main(args)
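track_demo.py shares almost all of its loop with track.py; the difference is the DemoDataset input. A minimal sketch of feeding it a video (the file name is a placeholder; for yolov8 the dataset returns the raw frame twice and lets ultralytics do the preprocessing):

import torch
from tracker_dataloader import DemoDataset

dataset = DemoDataset(file_name='demo.mp4', img_size=[640, 640], model='yolov8')
loader = torch.utils.data.DataLoader(dataset, batch_size=1, shuffle=False)

for ori_img, img in loader:
    print(ori_img.shape, img.shape)  # identical (1, H, W, C) tensors for yolov8
    break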
223
yolov7-tracker-example/tracker/tracker_dataloader.py
Normal file
223
yolov7-tracker-example/tracker/tracker_dataloader.py
Normal file
@ -0,0 +1,223 @@
import numpy as np
import torch
import cv2
import os
import os.path as osp

from torch.utils.data import Dataset


class TestDataset(Dataset):
    """ This class generates the original image and the preprocessed image for inference

    NOTE: initialize one TestDataset instance per sequence

    """

    def __init__(self, data_root, split, seq_name, img_size=[640, 640], legacy_yolox=True, model='yolox', **kwargs) -> None:
        """
        Args:
            data_root: path of the entire dataset
            seq_name: name of the sequence
            img_size: List[int, int] | Tuple[int, int], image size for the detection model
            legacy_yolox: bool, to be compatible with older versions of yolox
            model: detection model, currently supports yolox, yolov7, yolov8
        """
        super().__init__()

        self.model = model

        self.data_root = data_root
        self.seq_name = seq_name
        self.img_size = img_size
        self.split = split

        self.seq_path = osp.join(self.data_root, 'images', self.split, self.seq_name)
        self.imgs_in_seq = sorted(os.listdir(self.seq_path))

        self.legacy = legacy_yolox

        self.other_param = kwargs

    def __getitem__(self, idx):

        if self.model == 'yolox':
            return self._getitem_yolox(idx)
        elif self.model == 'yolov7':
            return self._getitem_yolov7(idx)
        elif self.model == 'yolov8':
            return self._getitem_yolov8(idx)

    def _getitem_yolox(self, idx):

        img = cv2.imread(osp.join(self.seq_path, self.imgs_in_seq[idx]))
        img_resized, _ = self._preprocess_yolox(img, self.img_size, )
        if self.legacy:
            img_resized = img_resized[::-1, :, :].copy()  # BGR -> RGB
            img_resized /= 255.0
            img_resized -= np.array([0.485, 0.456, 0.406]).reshape(3, 1, 1)
            img_resized /= np.array([0.229, 0.224, 0.225]).reshape(3, 1, 1)

        return torch.from_numpy(img), torch.from_numpy(img_resized)

    def _getitem_yolov7(self, idx):

        img = cv2.imread(osp.join(self.seq_path, self.imgs_in_seq[idx]))

        img_resized = self._preprocess_yolov7(img, )  # torch.Tensor

        return torch.from_numpy(img), img_resized

    def _getitem_yolov8(self, idx):

        img = cv2.imread(osp.join(self.seq_path, self.imgs_in_seq[idx]))  # (h, w, c)
        # img = self._preprocess_yolov8(img)

        return torch.from_numpy(img), torch.from_numpy(img)


    def _preprocess_yolox(self, img, size, swap=(2, 0, 1)):
        """ convert the original image to a resized, padded image, YOLOX-style

        Args:
            img: np.ndarray
            size: List[int, int] | Tuple[int, int]
            swap: (H, W, C) -> (C, H, W)

        Returns:
            np.ndarray, float

        """
        if len(img.shape) == 3:
            padded_img = np.ones((size[0], size[1], 3), dtype=np.uint8) * 114
        else:
            padded_img = np.ones(size, dtype=np.uint8) * 114

        r = min(size[0] / img.shape[0], size[1] / img.shape[1])
        resized_img = cv2.resize(
            img,
            (int(img.shape[1] * r), int(img.shape[0] * r)),
            interpolation=cv2.INTER_LINEAR,
        ).astype(np.uint8)
        padded_img[: int(img.shape[0] * r), : int(img.shape[1] * r)] = resized_img

        padded_img = padded_img.transpose(swap)
        padded_img = np.ascontiguousarray(padded_img, dtype=np.float32)
        return padded_img, r

    def _preprocess_yolov7(self, img, ):

        img_resized = self._letterbox(img, new_shape=self.img_size, stride=self.other_param['stride'], )[0]
        img_resized = img_resized[:, :, ::-1].transpose(2, 0, 1)  # BGR to RGB, HWC to CHW
        img_resized = np.ascontiguousarray(img_resized)

        img_resized = torch.from_numpy(img_resized).float()
        img_resized /= 255.0

        return img_resized

    def _preprocess_yolov8(self, img, ):

        img = img.transpose((2, 0, 1))
        img = np.ascontiguousarray(img)

        return img


    def _letterbox(self, img, new_shape=(640, 640), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True, stride=32):
        # Resize and pad image while meeting stride-multiple constraints
        shape = img.shape[:2]  # current shape [height, width]
        if isinstance(new_shape, int):
            new_shape = (new_shape, new_shape)

        # Scale ratio (new / old)
        r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
        if not scaleup:  # only scale down, do not scale up (for better test mAP)
            r = min(r, 1.0)

        # Compute padding
        ratio = r, r  # width, height ratios
        new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
        dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding
        if auto:  # minimum rectangle
            dw, dh = np.mod(dw, stride), np.mod(dh, stride)  # wh padding
        elif scaleFill:  # stretch
            dw, dh = 0.0, 0.0
            new_unpad = (new_shape[1], new_shape[0])
            ratio = new_shape[1] / shape[1], new_shape[0] / shape[0]  # width, height ratios

        dw /= 2  # divide padding into 2 sides
        dh /= 2

        if shape[::-1] != new_unpad:  # resize
            img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR)
        top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
        left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
        img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
        return img, ratio, (dw, dh)

    def __len__(self, ):
        return len(self.imgs_in_seq)


class DemoDataset(TestDataset):
    """
    dataset for demo videos and image folders
    """
    def __init__(self, file_name, img_size=[640, 640], model='yolox', legacy_yolox=True, **kwargs) -> None:

        self.file_name = file_name
        self.model = model
        self.img_size = img_size
        self.other_param = kwargs  # keep stride etc.; _preprocess_yolov7 reads it

        self.is_video = '.mp4' in file_name or '.avi' in file_name

        if not self.is_video:
            self.imgs_in_seq = sorted(os.listdir(file_name))
        else:
            self.imgs_in_seq = []
            self.cap = cv2.VideoCapture(file_name)

            while True:
                ret, frame = self.cap.read()
                if not ret: break

                self.imgs_in_seq.append(frame)

        self.legacy = legacy_yolox

    def __getitem__(self, idx):

        if not self.is_video:
            img = cv2.imread(osp.join(self.file_name, self.imgs_in_seq[idx]))
        else:
            img = self.imgs_in_seq[idx]

        if self.model == 'yolox':
            return self._getitem_yolox(img)
        elif self.model == 'yolov7':
            return self._getitem_yolov7(img)
        elif self.model == 'yolov8':
            return self._getitem_yolov8(img)

    def _getitem_yolox(self, img):

        img_resized, _ = self._preprocess_yolox(img, self.img_size, )
        if self.legacy:
            img_resized = img_resized[::-1, :, :].copy()  # BGR -> RGB
            img_resized /= 255.0
            img_resized -= np.array([0.485, 0.456, 0.406]).reshape(3, 1, 1)
            img_resized /= np.array([0.229, 0.224, 0.225]).reshape(3, 1, 1)

        return torch.from_numpy(img), torch.from_numpy(img_resized)

    def _getitem_yolov7(self, img):

        img_resized = self._preprocess_yolov7(img, )  # torch.Tensor

        return torch.from_numpy(img), img_resized

    def _getitem_yolov8(self, img):

        # img = self._preprocess_yolov8(img)

        return torch.from_numpy(img), torch.from_numpy(img)
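The _letterbox helper is easiest to follow with concrete numbers: it scales by the limiting side, then (in the default auto mode) pads only up to the nearest stride multiple rather than to the full target shape. A worked example:

import numpy as np

# a 1080x1920 (h, w) frame letterboxed toward (640, 640) with stride 32
shape, new_shape, stride = (1080, 1920), (640, 640), 32
r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])           # 640/1920 = 0.333
new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))      # (640, 360)
dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]   # (0, 280)
dw, dh = np.mod(dw, stride), np.mod(dh, stride)                     # auto mode -> (0, 24)
print(new_unpad, dw / 2, dh / 2)  # resized to 640x360, then 12 px of border top and bottom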
133
yolov7-tracker-example/tracker/trackers/basetrack.py
Normal file
133
yolov7-tracker-example/tracker/trackers/basetrack.py
Normal file
@ -0,0 +1,133 @@
import numpy as np
from collections import OrderedDict


class TrackState(object):
    New = 0
    Tracked = 1
    Lost = 2
    Removed = 3


class BaseTrack(object):
    _count = 0

    track_id = 0
    is_activated = False
    state = TrackState.New

    history = OrderedDict()
    features = []
    curr_feature = None
    score = 0
    start_frame = 0
    frame_id = 0
    time_since_update = 0

    # multi-camera
    location = (np.inf, np.inf)

    @property
    def end_frame(self):
        return self.frame_id

    @staticmethod
    def next_id():
        BaseTrack._count += 1
        return BaseTrack._count

    def activate(self, *args):
        raise NotImplementedError

    def predict(self):
        raise NotImplementedError

    def update(self, *args, **kwargs):
        raise NotImplementedError

    def mark_lost(self):
        self.state = TrackState.Lost

    def mark_removed(self):
        self.state = TrackState.Removed

    @property
    def tlwh(self):
        """Get current position in bounding box format `(top left x, top left y,
        width, height)`.
        """
        if self.mean is None:
            return self._tlwh.copy()
        ret = self.mean[:4].copy()
        ret[:2] -= ret[2:] / 2
        return ret

    @property
    def tlbr(self):
        """Convert bounding box to format `(min x, min y, max x, max y)`, i.e.,
        `(top left, bottom right)`.
        """
        ret = self.tlwh.copy()
        ret[2:] += ret[:2]
        return ret

    @property
    def xywh(self):
        """Convert bounding box to format `(center x, center y, width, height)`."""
        ret = self.tlwh.copy()
        ret[:2] += ret[2:] / 2.0
        return ret

    @staticmethod
    # @jit(nopython=True)
    def tlwh_to_xyah(tlwh):
        """Convert bounding box to format `(center x, center y, aspect ratio,
        height)`, where the aspect ratio is `width / height`.
        """
        ret = np.asarray(tlwh).copy()
        ret[:2] += ret[2:] / 2
        ret[2] /= ret[3]
        return ret

    @staticmethod
    def tlwh_to_xywh(tlwh):
        """Convert bounding box to format `(center x, center y, width,
        height)`.
        """
        ret = np.asarray(tlwh).copy()
        ret[:2] += ret[2:] / 2
        return ret

    @staticmethod
    def tlwh_to_xysa(tlwh):
        """Convert bounding box to format `(center x, center y, area,
        aspect ratio)`, where the aspect ratio is `width / height`.
        """
        ret = np.asarray(tlwh).copy()
        ret[:2] += ret[2:] / 2
        ret[2] = tlwh[2] * tlwh[3]
        ret[3] = tlwh[2] / tlwh[3]
        return ret

    def to_xyah(self):
        return self.tlwh_to_xyah(self.tlwh)

    def to_xywh(self):
        return self.tlwh_to_xywh(self.tlwh)

    @staticmethod
    def tlbr_to_tlwh(tlbr):
        ret = np.asarray(tlbr).copy()
        ret[2:] -= ret[:2]
        return ret

    @staticmethod
    # @jit(nopython=True)
    def tlwh_to_tlbr(tlwh):
        ret = np.asarray(tlwh).copy()
        ret[2:] += ret[:2]
        return ret

    def __repr__(self):
        return 'OT_{}_({}-{})'.format(self.track_id, self.start_frame, self.end_frame)
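A quick sanity check of the box-format helpers above; the array values are invented for illustration, assuming basetrack.py is importable:

import numpy as np
# from tracker.trackers.basetrack import BaseTrack  # path assumed

tlwh = np.array([100., 50., 40., 80.])   # top-left x/y, width, height
tlbr = BaseTrack.tlwh_to_tlbr(tlwh)      # -> [100. 50. 140. 130.]
xyah = BaseTrack.tlwh_to_xyah(tlwh)      # -> [120. 90. 0.5 80.]  (center, w/h ratio, height)
assert np.allclose(BaseTrack.tlbr_to_tlwh(tlbr), tlwh)  # round-trip holds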
329 yolov7-tracker-example/tracker/trackers/botsort_tracker.py Normal file
@ -0,0 +1,329 @@
"""
BoT-SORT
"""

import numpy as np
import torch
from torchvision.ops import nms

import cv2
import torchvision.transforms as T

from .basetrack import BaseTrack, TrackState
from .tracklet import Tracklet, Tracklet_w_reid
from .matching import *

from .reid_models.OSNet import *
from .reid_models.load_model_tools import load_pretrained_weights
from .reid_models.deepsort_reid import Extractor

from .camera_motion_compensation import GMC

REID_MODEL_DICT = {
    'osnet_x1_0': osnet_x1_0,
    'osnet_x0_75': osnet_x0_75,
    'osnet_x0_5': osnet_x0_5,
    'osnet_x0_25': osnet_x0_25,
    'deepsort': Extractor
}


def load_reid_model(reid_model, reid_model_path):

    if 'osnet' in reid_model:
        func = REID_MODEL_DICT[reid_model]
        model = func(num_classes=1, pretrained=False)
        load_pretrained_weights(model, reid_model_path)
        model.cuda().eval()

    elif 'deepsort' in reid_model:
        model = REID_MODEL_DICT[reid_model](reid_model_path, use_cuda=True)

    else:
        raise NotImplementedError

    return model


class BotTracker(object):
    def __init__(self, args, frame_rate=30):
        self.tracked_tracklets = []  # type: list[Tracklet]
        self.lost_tracklets = []  # type: list[Tracklet]
        self.removed_tracklets = []  # type: list[Tracklet]

        self.frame_id = 0
        self.args = args

        self.det_thresh = args.conf_thresh + 0.1
        self.buffer_size = int(frame_rate / 30.0 * args.track_buffer)
        self.max_time_lost = self.buffer_size

        self.motion = args.kalman_format

        self.with_reid = not args.discard_reid

        self.reid_model, self.crop_transforms = None, None
        if self.with_reid:
            self.reid_model = load_reid_model(args.reid_model, args.reid_model_path)
            self.crop_transforms = T.Compose([
                # T.ToPILImage(),
                # T.Resize(size=(256, 128)),
                T.ToTensor(),  # (c, h, w)
                T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
            ])

        # camera motion compensation module
        self.gmc = GMC(method='orb', downscale=2, verbose=None)

    def reid_preprocess(self, obj_bbox):
        """
        preprocess cropped object bboxes

        obj_bbox: np.ndarray, shape=(h_obj, w_obj, c)

        return:
            torch.Tensor of shape (c, 128, 128)
        """
        obj_bbox = cv2.resize(obj_bbox.astype(np.float32) / 255.0, dsize=(128, 128))  # shape: (128, 128, c)

        return self.crop_transforms(obj_bbox)

    def get_feature(self, tlwhs, ori_img):
        """
        get the appearance feature of an object
        tlwhs: shape (num_of_objects, 4)
        ori_img: original image, np.ndarray, shape(H, W, C)
        """
        obj_bbox = []

        for tlwh in tlwhs:
            tlwh = list(map(int, tlwh))
            # if any(tlbr_ == -1 for tlbr_ in tlwh):
            #     print(tlwh)

            tlbr_tensor = self.reid_preprocess(ori_img[tlwh[1]: tlwh[1] + tlwh[3], tlwh[0]: tlwh[0] + tlwh[2]])
            obj_bbox.append(tlbr_tensor)

        if not obj_bbox:
            return np.array([])

        obj_bbox = torch.stack(obj_bbox, dim=0)
        obj_bbox = obj_bbox.cuda()

        features = self.reid_model(obj_bbox)  # shape: (num_of_objects, feature_dim)
        return features.cpu().detach().numpy()

    def update(self, output_results, img, ori_img):
        """
        output_results: processed detections (scaled to original size), tlwh format
        """

        self.frame_id += 1
        activated_tracklets = []
        refind_tracklets = []
        lost_tracklets = []
        removed_tracklets = []

        scores = output_results[:, 4]
        bboxes = output_results[:, :4]
        categories = output_results[:, -1]

        remain_inds = scores > self.args.conf_thresh
        inds_low = scores > 0.1
        inds_high = scores < self.args.conf_thresh

        inds_second = np.logical_and(inds_low, inds_high)
        dets_second = bboxes[inds_second]
        dets = bboxes[remain_inds]

        cates = categories[remain_inds]
        cates_second = categories[inds_second]

        scores_keep = scores[remain_inds]
        scores_second = scores[inds_second]

        """Step 1: Extract reid features"""
        if self.with_reid:
            features_keep = self.get_feature(tlwhs=dets[:, :4], ori_img=ori_img)

        if len(dets) > 0:
            if self.with_reid:
                detections = [Tracklet_w_reid(tlwh, s, cate, motion=self.motion, feat=feat) for
                              (tlwh, s, cate, feat) in zip(dets, scores_keep, cates, features_keep)]
            else:
                detections = [Tracklet(tlwh, s, cate, motion=self.motion) for
                              (tlwh, s, cate) in zip(dets, scores_keep, cates)]
        else:
            detections = []

        ''' Add newly detected tracklets to tracked_tracklets'''
        unconfirmed = []
        tracked_tracklets = []  # type: list[Tracklet]
        for track in self.tracked_tracklets:
            if not track.is_activated:
                unconfirmed.append(track)
            else:
                tracked_tracklets.append(track)

        ''' Step 2: First association, with high score detection boxes'''
        tracklet_pool = joint_tracklets(tracked_tracklets, self.lost_tracklets)

        # Predict the current location with Kalman
        for tracklet in tracklet_pool:
            tracklet.predict()

        # Camera motion compensation
        warp = self.gmc.apply(ori_img, dets)
        self.gmc.multi_gmc(tracklet_pool, warp)
        self.gmc.multi_gmc(unconfirmed, warp)

        ious_dists = iou_distance(tracklet_pool, detections)
        ious_dists_mask = (ious_dists > 0.5)  # high conf iou

        if self.with_reid:
            # mixed cost matrix
            emb_dists = embedding_distance(tracklet_pool, detections) / 2.0
            raw_emb_dists = emb_dists.copy()
            emb_dists[emb_dists > 0.25] = 1.0
            emb_dists[ious_dists_mask] = 1.0
            dists = np.minimum(ious_dists, emb_dists)

        else:
            dists = ious_dists

        matches, u_track, u_detection = linear_assignment(dists, thresh=0.9)

        for itracked, idet in matches:
            track = tracklet_pool[itracked]
            det = detections[idet]
            if track.state == TrackState.Tracked:
                track.update(detections[idet], self.frame_id)
                activated_tracklets.append(track)
            else:
                track.re_activate(det, self.frame_id, new_id=False)
                refind_tracklets.append(track)

        ''' Step 3: Second association, with low score detection boxes'''
        # associate the unmatched tracks to the low score detections
        if len(dets_second) > 0:
            '''Detections'''
            detections_second = [Tracklet(tlwh, s, cate, motion=self.motion) for
                                 (tlwh, s, cate) in zip(dets_second, scores_second, cates_second)]
        else:
            detections_second = []

        r_tracked_tracklets = [tracklet_pool[i] for i in u_track if tracklet_pool[i].state == TrackState.Tracked]
        dists = iou_distance(r_tracked_tracklets, detections_second)
        matches, u_track, u_detection_second = linear_assignment(dists, thresh=0.5)
        for itracked, idet in matches:
            track = r_tracked_tracklets[itracked]
            det = detections_second[idet]
            if track.state == TrackState.Tracked:
                track.update(det, self.frame_id)
                activated_tracklets.append(track)
            else:
                track.re_activate(det, self.frame_id, new_id=False)
                refind_tracklets.append(track)

        for it in u_track:
            track = r_tracked_tracklets[it]
            if not track.state == TrackState.Lost:
                track.mark_lost()
                lost_tracklets.append(track)

        '''Deal with unconfirmed tracks, usually tracks with only one beginning frame'''
        detections = [detections[i] for i in u_detection]
        ious_dists = iou_distance(unconfirmed, detections)
        ious_dists_mask = (ious_dists > 0.5)

        if self.with_reid:
            emb_dists = embedding_distance(unconfirmed, detections) / 2.0
            raw_emb_dists = emb_dists.copy()
            emb_dists[emb_dists > 0.25] = 1.0
            emb_dists[ious_dists_mask] = 1.0
            dists = np.minimum(ious_dists, emb_dists)
        else:
            dists = ious_dists

        matches, u_unconfirmed, u_detection = linear_assignment(dists, thresh=0.7)

        for itracked, idet in matches:
            unconfirmed[itracked].update(detections[idet], self.frame_id)
            activated_tracklets.append(unconfirmed[itracked])
        for it in u_unconfirmed:
            track = unconfirmed[it]
            track.mark_removed()
            removed_tracklets.append(track)

        """ Step 4: Init new tracklets"""
        for inew in u_detection:
            track = detections[inew]
            if track.score < self.det_thresh:
                continue
            track.activate(self.frame_id)
            activated_tracklets.append(track)

        """ Step 5: Update state"""
        for track in self.lost_tracklets:
            if self.frame_id - track.end_frame > self.max_time_lost:
                track.mark_removed()
                removed_tracklets.append(track)

        # print('Remained match {} s'.format(t4-t3))

        self.tracked_tracklets = [t for t in self.tracked_tracklets if t.state == TrackState.Tracked]
        self.tracked_tracklets = joint_tracklets(self.tracked_tracklets, activated_tracklets)
        self.tracked_tracklets = joint_tracklets(self.tracked_tracklets, refind_tracklets)
        self.lost_tracklets = sub_tracklets(self.lost_tracklets, self.tracked_tracklets)
        self.lost_tracklets.extend(lost_tracklets)
        self.lost_tracklets = sub_tracklets(self.lost_tracklets, self.removed_tracklets)
        self.removed_tracklets.extend(removed_tracklets)
        self.tracked_tracklets, self.lost_tracklets = remove_duplicate_tracklets(self.tracked_tracklets, self.lost_tracklets)
        # get scores of lost tracks
        output_tracklets = [track for track in self.tracked_tracklets if track.is_activated]

        return output_tracklets


def joint_tracklets(tlista, tlistb):
    exists = {}
    res = []
    for t in tlista:
        exists[t.track_id] = 1
        res.append(t)
    for t in tlistb:
        tid = t.track_id
        if not exists.get(tid, 0):
            exists[tid] = 1
            res.append(t)
    return res


def sub_tracklets(tlista, tlistb):
    tracklets = {}
    for t in tlista:
        tracklets[t.track_id] = t
    for t in tlistb:
        tid = t.track_id
        if tracklets.get(tid, 0):
            del tracklets[tid]
    return list(tracklets.values())


def remove_duplicate_tracklets(trackletsa, trackletsb):
    pdist = iou_distance(trackletsa, trackletsb)
    pairs = np.where(pdist < 0.15)
    dupa, dupb = list(), list()
    for p, q in zip(*pairs):
        timep = trackletsa[p].frame_id - trackletsa[p].start_frame
        timeq = trackletsb[q].frame_id - trackletsb[q].start_frame
        if timep > timeq:
            dupb.append(q)
        else:
            dupa.append(p)
    resa = [t for i, t in enumerate(trackletsa) if i not in dupa]
    resb = [t for i, t in enumerate(trackletsb) if i not in dupb]
    return resa, resb
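The score split above is the ByteTrack-style cascade: high-confidence detections drive the first association, and the leftovers (0.1 < score < conf_thresh) get a second chance. A minimal sketch of the index logic, with made-up scores and conf_thresh = 0.5:

import numpy as np

scores = np.array([0.9, 0.3, 0.05, 0.6, 0.2])
conf_thresh = 0.5

remain_inds = scores > conf_thresh                                # first-stage detections
inds_second = np.logical_and(scores > 0.1, scores < conf_thresh)  # second-stage detections

print(np.where(remain_inds)[0])   # [0 3]
print(np.where(inds_second)[0])   # [1 4]  (0.05 is discarded entirely)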
201 yolov7-tracker-example/tracker/trackers/byte_tracker.py Normal file
@ -0,0 +1,201 @@
"""
ByteTrack
"""

import numpy as np
from collections import deque
from .basetrack import BaseTrack, TrackState
from .tracklet import Tracklet
from .matching import *

class ByteTracker(object):
    def __init__(self, args, frame_rate=30):
        self.tracked_tracklets = []  # type: list[Tracklet]
        self.lost_tracklets = []  # type: list[Tracklet]
        self.removed_tracklets = []  # type: list[Tracklet]

        self.frame_id = 0
        self.args = args

        self.det_thresh = args.conf_thresh + 0.1
        self.buffer_size = int(frame_rate / 30.0 * args.track_buffer)
        self.max_time_lost = self.buffer_size

        self.motion = args.kalman_format

    def update(self, output_results, img, ori_img):
        """
        output_results: processed detections (scaled to original size), tlbr format
        """

        self.frame_id += 1
        activated_tracklets = []
        refind_tracklets = []
        lost_tracklets = []
        removed_tracklets = []

        scores = output_results[:, 4]
        bboxes = output_results[:, :4]
        categories = output_results[:, -1]

        remain_inds = scores > self.args.conf_thresh
        inds_low = scores > 0.1
        inds_high = scores < self.args.conf_thresh

        inds_second = np.logical_and(inds_low, inds_high)
        dets_second = bboxes[inds_second]
        dets = bboxes[remain_inds]

        cates = categories[remain_inds]
        cates_second = categories[inds_second]

        scores_keep = scores[remain_inds]
        scores_second = scores[inds_second]

        if len(dets) > 0:
            '''Detections'''
            detections = [Tracklet(tlwh, s, cate, motion=self.motion) for
                          (tlwh, s, cate) in zip(dets, scores_keep, cates)]
        else:
            detections = []

        ''' Add newly detected tracklets to tracked_tracklets'''
        unconfirmed = []
        tracked_tracklets = []  # type: list[Tracklet]
        for track in self.tracked_tracklets:
            if not track.is_activated:
                unconfirmed.append(track)
            else:
                tracked_tracklets.append(track)

        ''' Step 2: First association, with high score detection boxes'''
        tracklet_pool = joint_tracklets(tracked_tracklets, self.lost_tracklets)

        # Predict the current location with Kalman
        for tracklet in tracklet_pool:
            tracklet.predict()

        dists = iou_distance(tracklet_pool, detections)

        matches, u_track, u_detection = linear_assignment(dists, thresh=0.9)

        for itracked, idet in matches:
            track = tracklet_pool[itracked]
            det = detections[idet]
            if track.state == TrackState.Tracked:
                track.update(detections[idet], self.frame_id)
                activated_tracklets.append(track)
            else:
                track.re_activate(det, self.frame_id, new_id=False)
                refind_tracklets.append(track)

        ''' Step 3: Second association, with low score detection boxes'''
        # associate the unmatched tracks to the low score detections
        if len(dets_second) > 0:
            '''Detections'''
            detections_second = [Tracklet(tlwh, s, cate, motion=self.motion) for
                                 (tlwh, s, cate) in zip(dets_second, scores_second, cates_second)]
        else:
            detections_second = []
        r_tracked_tracklets = [tracklet_pool[i] for i in u_track if tracklet_pool[i].state == TrackState.Tracked]
        dists = iou_distance(r_tracked_tracklets, detections_second)
        matches, u_track, u_detection_second = linear_assignment(dists, thresh=0.5)
        for itracked, idet in matches:
            track = r_tracked_tracklets[itracked]
            det = detections_second[idet]
            if track.state == TrackState.Tracked:
                track.update(det, self.frame_id)
                activated_tracklets.append(track)
            else:
                track.re_activate(det, self.frame_id, new_id=False)
                refind_tracklets.append(track)

        for it in u_track:
            track = r_tracked_tracklets[it]
            if not track.state == TrackState.Lost:
                track.mark_lost()
                lost_tracklets.append(track)

        '''Deal with unconfirmed tracks, usually tracks with only one beginning frame'''
        detections = [detections[i] for i in u_detection]
        dists = iou_distance(unconfirmed, detections)

        matches, u_unconfirmed, u_detection = linear_assignment(dists, thresh=0.7)

        for itracked, idet in matches:
            unconfirmed[itracked].update(detections[idet], self.frame_id)
            activated_tracklets.append(unconfirmed[itracked])
        for it in u_unconfirmed:
            track = unconfirmed[it]
            track.mark_removed()
            removed_tracklets.append(track)

        """ Step 4: Init new tracklets"""
        for inew in u_detection:
            track = detections[inew]
            if track.score < self.det_thresh:
                continue
            track.activate(self.frame_id)
            activated_tracklets.append(track)

        """ Step 5: Update state"""
        for track in self.lost_tracklets:
            if self.frame_id - track.end_frame > self.max_time_lost:
                track.mark_removed()
                removed_tracklets.append(track)

        # print('Remained match {} s'.format(t4-t3))

        self.tracked_tracklets = [t for t in self.tracked_tracklets if t.state == TrackState.Tracked]
        self.tracked_tracklets = joint_tracklets(self.tracked_tracklets, activated_tracklets)
        self.tracked_tracklets = joint_tracklets(self.tracked_tracklets, refind_tracklets)
        self.lost_tracklets = sub_tracklets(self.lost_tracklets, self.tracked_tracklets)
        self.lost_tracklets.extend(lost_tracklets)
        self.lost_tracklets = sub_tracklets(self.lost_tracklets, self.removed_tracklets)
        self.removed_tracklets.extend(removed_tracklets)
        self.tracked_tracklets, self.lost_tracklets = remove_duplicate_tracklets(self.tracked_tracklets, self.lost_tracklets)
        # get scores of lost tracks
        output_tracklets = [track for track in self.tracked_tracklets if track.is_activated]

        return output_tracklets


def joint_tracklets(tlista, tlistb):
    exists = {}
    res = []
    for t in tlista:
        exists[t.track_id] = 1
        res.append(t)
    for t in tlistb:
        tid = t.track_id
        if not exists.get(tid, 0):
            exists[tid] = 1
            res.append(t)
    return res


def sub_tracklets(tlista, tlistb):
    tracklets = {}
    for t in tlista:
        tracklets[t.track_id] = t
    for t in tlistb:
        tid = t.track_id
        if tracklets.get(tid, 0):
            del tracklets[tid]
    return list(tracklets.values())


def remove_duplicate_tracklets(trackletsa, trackletsb):
    pdist = iou_distance(trackletsa, trackletsb)
    pairs = np.where(pdist < 0.15)
    dupa, dupb = list(), list()
    for p, q in zip(*pairs):
        timep = trackletsa[p].frame_id - trackletsa[p].start_frame
        timeq = trackletsb[q].frame_id - trackletsb[q].start_frame
        if timep > timeq:
            dupb.append(q)
        else:
            dupa.append(p)
    resa = [t for i, t in enumerate(trackletsa) if i not in dupa]
    resb = [t for i, t in enumerate(trackletsb) if i not in dupb]
    return resa, resb
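`linear_assignment` (imported from .matching) solves the same problem as the Hungarian solver: pick the minimum-cost track/detection pairing, rejecting pairs whose cost exceeds thresh. A rough scipy-based equivalent, as a sketch only (the repo's implementation may use a different solver such as lap):

import numpy as np
from scipy.optimize import linear_sum_assignment

def linear_assignment_sketch(cost, thresh):
    # rows = tracks, cols = detections; pairs costlier than thresh are rejected
    if cost.size == 0:
        return [], list(range(cost.shape[0])), list(range(cost.shape[1]))
    rows, cols = linear_sum_assignment(cost)
    matches = [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= thresh]
    u_track = [r for r in range(cost.shape[0]) if r not in {m[0] for m in matches}]
    u_det = [c for c in range(cost.shape[1]) if c not in {m[1] for m in matches}]
    return matches, u_track, u_det

cost = np.array([[0.2, 0.95], [0.8, 0.1]])
print(linear_assignment_sketch(cost, thresh=0.9))  # both pairs matched, nothing left over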
204 yolov7-tracker-example/tracker/trackers/c_biou_tracker.py Normal file
@ -0,0 +1,204 @@
"""
C-BIoU Track
"""

import numpy as np
from collections import deque
from .basetrack import BaseTrack, TrackState
from .tracklet import Tracklet, Tracklet_w_bbox_buffer
from .matching import *

class C_BIoUTracker(object):
    def __init__(self, args, frame_rate=30):
        self.tracked_tracklets = []  # type: list[Tracklet]
        self.lost_tracklets = []  # type: list[Tracklet]
        self.removed_tracklets = []  # type: list[Tracklet]

        self.frame_id = 0
        self.args = args

        self.det_thresh = args.conf_thresh + 0.1
        self.buffer_size = int(frame_rate / 30.0 * args.track_buffer)
        self.max_time_lost = self.buffer_size

        self.motion = args.kalman_format

    def update(self, output_results, img, ori_img):
        """
        output_results: processed detections (scaled to original size), tlbr format
        """

        self.frame_id += 1
        activated_tracklets = []
        refind_tracklets = []
        lost_tracklets = []
        removed_tracklets = []

        scores = output_results[:, 4]
        bboxes = output_results[:, :4]
        categories = output_results[:, -1]

        remain_inds = scores > self.args.conf_thresh
        inds_low = scores > 0.1
        inds_high = scores < self.args.conf_thresh

        inds_second = np.logical_and(inds_low, inds_high)
        dets_second = bboxes[inds_second]
        dets = bboxes[remain_inds]

        cates = categories[remain_inds]
        cates_second = categories[inds_second]

        scores_keep = scores[remain_inds]
        scores_second = scores[inds_second]

        if len(dets) > 0:
            '''Detections'''
            detections = [Tracklet_w_bbox_buffer(tlwh, s, cate, motion=self.motion) for
                          (tlwh, s, cate) in zip(dets, scores_keep, cates)]
        else:
            detections = []

        ''' Add newly detected tracklets to tracked_tracklets'''
        unconfirmed = []
        tracked_tracklets = []  # type: list[Tracklet]
        for track in self.tracked_tracklets:
            if not track.is_activated:
                unconfirmed.append(track)
            else:
                tracked_tracklets.append(track)

        ''' Step 2: First association, with high score detection boxes'''
        tracklet_pool = joint_tracklets(tracked_tracklets, self.lost_tracklets)

        # Predict the current location with Kalman
        for tracklet in tracklet_pool:
            tracklet.predict()

        dists = buffered_iou_distance(tracklet_pool, detections, level=1)

        matches, u_track, u_detection = linear_assignment(dists, thresh=0.9)

        for itracked, idet in matches:
            track = tracklet_pool[itracked]
            det = detections[idet]
            if track.state == TrackState.Tracked:
                track.update(detections[idet], self.frame_id)
                activated_tracklets.append(track)
            else:
                track.re_activate(det, self.frame_id, new_id=False)
                refind_tracklets.append(track)

        ''' Step 3: Second association, with low score detection boxes'''
        # associate the unmatched tracks to the low score detections
        if len(dets_second) > 0:
            '''Detections'''
            detections_second = [Tracklet_w_bbox_buffer(tlwh, s, cate, motion=self.motion) for
                                 (tlwh, s, cate) in zip(dets_second, scores_second, cates_second)]
        else:
            detections_second = []
        r_tracked_tracklets = [tracklet_pool[i] for i in u_track if tracklet_pool[i].state == TrackState.Tracked]

        dists = buffered_iou_distance(r_tracked_tracklets, detections_second, level=2)

        matches, u_track, u_detection_second = linear_assignment(dists, thresh=0.5)
        for itracked, idet in matches:
            track = r_tracked_tracklets[itracked]
            det = detections_second[idet]
            if track.state == TrackState.Tracked:
                track.update(det, self.frame_id)
                activated_tracklets.append(track)
            else:
                track.re_activate(det, self.frame_id, new_id=False)
                refind_tracklets.append(track)

        for it in u_track:
            track = r_tracked_tracklets[it]
            if not track.state == TrackState.Lost:
                track.mark_lost()
                lost_tracklets.append(track)

        '''Deal with unconfirmed tracks, usually tracks with only one beginning frame'''
        detections = [detections[i] for i in u_detection]
        dists = buffered_iou_distance(unconfirmed, detections, level=1)

        matches, u_unconfirmed, u_detection = linear_assignment(dists, thresh=0.7)

        for itracked, idet in matches:
            unconfirmed[itracked].update(detections[idet], self.frame_id)
            activated_tracklets.append(unconfirmed[itracked])
        for it in u_unconfirmed:
            track = unconfirmed[it]
            track.mark_removed()
            removed_tracklets.append(track)

        """ Step 4: Init new tracklets"""
        for inew in u_detection:
            track = detections[inew]
            if track.score < self.det_thresh:
                continue
            track.activate(self.frame_id)
            activated_tracklets.append(track)

        """ Step 5: Update state"""
        for track in self.lost_tracklets:
            if self.frame_id - track.end_frame > self.max_time_lost:
                track.mark_removed()
                removed_tracklets.append(track)

        # print('Remained match {} s'.format(t4-t3))

        self.tracked_tracklets = [t for t in self.tracked_tracklets if t.state == TrackState.Tracked]
        self.tracked_tracklets = joint_tracklets(self.tracked_tracklets, activated_tracklets)
        self.tracked_tracklets = joint_tracklets(self.tracked_tracklets, refind_tracklets)
        self.lost_tracklets = sub_tracklets(self.lost_tracklets, self.tracked_tracklets)
        self.lost_tracklets.extend(lost_tracklets)
        self.lost_tracklets = sub_tracklets(self.lost_tracklets, self.removed_tracklets)
        self.removed_tracklets.extend(removed_tracklets)
        self.tracked_tracklets, self.lost_tracklets = remove_duplicate_tracklets(self.tracked_tracklets, self.lost_tracklets)
        # get scores of lost tracks
        output_tracklets = [track for track in self.tracked_tracklets if track.is_activated]

        return output_tracklets


def joint_tracklets(tlista, tlistb):
    exists = {}
    res = []
    for t in tlista:
        exists[t.track_id] = 1
        res.append(t)
    for t in tlistb:
        tid = t.track_id
        if not exists.get(tid, 0):
            exists[tid] = 1
            res.append(t)
    return res


def sub_tracklets(tlista, tlistb):
    tracklets = {}
    for t in tlista:
        tracklets[t.track_id] = t
    for t in tlistb:
        tid = t.track_id
        if tracklets.get(tid, 0):
            del tracklets[tid]
    return list(tracklets.values())


def remove_duplicate_tracklets(trackletsa, trackletsb):
    pdist = iou_distance(trackletsa, trackletsb)
    pairs = np.where(pdist < 0.15)
    dupa, dupb = list(), list()
    for p, q in zip(*pairs):
        timep = trackletsa[p].frame_id - trackletsa[p].start_frame
        timeq = trackletsb[q].frame_id - trackletsb[q].start_frame
        if timep > timeq:
            dupb.append(q)
        else:
            dupa.append(p)
    resa = [t for i, t in enumerate(trackletsa) if i not in dupa]
    resb = [t for i, t in enumerate(trackletsb) if i not in dupb]
    return resa, resb
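C-BIoU matches on a buffered IoU: each box is inflated by a buffer scale (level 2 uses a larger scale than level 1) before the overlap is computed, so slowly drifting tracks still overlap their detections. A sketch of the buffering step; the buffer value 0.3 is an assumption for illustration, not necessarily the repo's setting:

import numpy as np

def buffer_tlwh(tlwh, b):
    # inflate an (x, y, w, h) box by b * (w, h) on each side
    x, y, w, h = tlwh
    return np.array([x - b * w, y - b * h, w + 2 * b * w, h + 2 * b * h])

print(buffer_tlwh(np.array([100., 50., 40., 80.]), b=0.3))
# [ 88.  26.  64. 128.]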
264 yolov7-tracker-example/tracker/trackers/camera_motion_compensation.py Normal file
@ -0,0 +1,264 @@
import cv2
import numpy as np
import copy
import matplotlib.pyplot as plt

"""GMC Module"""
class GMC:
    def __init__(self, method='orb', downscale=2, verbose=None):
        super(GMC, self).__init__()

        self.method = method
        self.downscale = max(1, int(downscale))

        if self.method == 'orb':
            self.detector = cv2.FastFeatureDetector_create(20)
            self.extractor = cv2.ORB_create()
            self.matcher = cv2.BFMatcher(cv2.NORM_HAMMING)

        elif self.method == 'sift':
            self.detector = cv2.SIFT_create(nOctaveLayers=3, contrastThreshold=0.02, edgeThreshold=20)
            self.extractor = cv2.SIFT_create(nOctaveLayers=3, contrastThreshold=0.02, edgeThreshold=20)
            self.matcher = cv2.BFMatcher(cv2.NORM_L2)

        elif self.method == 'ecc':
            number_of_iterations = 100
            termination_eps = 1e-5
            self.warp_mode = cv2.MOTION_EUCLIDEAN
            self.criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, number_of_iterations, termination_eps)

        elif self.method == 'file' or self.method == 'files':
            seqName = verbose[0]
            ablation = verbose[1]
            if ablation:
                filePath = r'tracker/GMC_files/MOT17_ablation'
            else:
                filePath = r'tracker/GMC_files/MOTChallenge'

            if '-FRCNN' in seqName:
                seqName = seqName[:-6]
            elif '-DPM' in seqName:
                seqName = seqName[:-4]
            elif '-SDP' in seqName:
                seqName = seqName[:-4]

            self.gmcFile = open(filePath + "/GMC-" + seqName + ".txt", 'r')

            if self.gmcFile is None:
                raise ValueError("Error: Unable to open GMC file in directory:" + filePath)
        elif self.method == 'none' or self.method == 'None':
            self.method = 'none'
        else:
            raise ValueError("Error: Unknown CMC method:" + method)

        self.prevFrame = None
        self.prevKeyPoints = None
        self.prevDescriptors = None

        self.initializedFirstFrame = False

    def apply(self, raw_frame, detections=None):
        if self.method == 'orb' or self.method == 'sift':
            return self.applyFeatures(raw_frame, detections)
        elif self.method == 'ecc':
            return self.applyEcc(raw_frame, detections)
        elif self.method == 'file':
            return self.applyFile(raw_frame, detections)
        elif self.method == 'none':
            return np.eye(2, 3)
        else:
            return np.eye(2, 3)

    def applyEcc(self, raw_frame, detections=None):

        # Initialize
        height, width, _ = raw_frame.shape
        frame = cv2.cvtColor(raw_frame, cv2.COLOR_BGR2GRAY)
        H = np.eye(2, 3, dtype=np.float32)

        # Downscale image (TODO: consider using pyramids)
        if self.downscale > 1.0:
            frame = cv2.GaussianBlur(frame, (3, 3), 1.5)
            frame = cv2.resize(frame, (width // self.downscale, height // self.downscale))
            width = width // self.downscale
            height = height // self.downscale

        # Handle first frame
        if not self.initializedFirstFrame:
            # Initialize data
            self.prevFrame = frame.copy()

            # Initialization done
            self.initializedFirstFrame = True

            return H

        # Run the ECC algorithm. The results are stored in warp_matrix.
        # (cc, H) = cv2.findTransformECC(self.prevFrame, frame, H, self.warp_mode, self.criteria)
        try:
            (cc, H) = cv2.findTransformECC(self.prevFrame, frame, H, self.warp_mode, self.criteria, None, 1)
        except Exception:
            print('Warning: find transform failed. Set warp as identity')

        return H

    def applyFeatures(self, raw_frame, detections=None):

        # Initialize
        height, width, _ = raw_frame.shape
        frame = cv2.cvtColor(raw_frame, cv2.COLOR_BGR2GRAY)
        H = np.eye(2, 3)

        # Downscale image (TODO: consider using pyramids)
        if self.downscale > 1.0:
            # frame = cv2.GaussianBlur(frame, (3, 3), 1.5)
            frame = cv2.resize(frame, (width // self.downscale, height // self.downscale))
            width = width // self.downscale
            height = height // self.downscale

        # find the keypoints, masking out the image border and the detection boxes
        mask = np.zeros_like(frame)
        # mask[int(0.05 * height): int(0.95 * height), int(0.05 * width): int(0.95 * width)] = 255
        mask[int(0.02 * height): int(0.98 * height), int(0.02 * width): int(0.98 * width)] = 255
        if detections is not None:
            for det in detections:
                tlbr = (det[:4] / self.downscale).astype(np.int_)
                mask[tlbr[1]:tlbr[3], tlbr[0]:tlbr[2]] = 0

        keypoints = self.detector.detect(frame, mask)

        # compute the descriptors
        keypoints, descriptors = self.extractor.compute(frame, keypoints)

        # Handle first frame
        if not self.initializedFirstFrame:
            # Initialize data
            self.prevFrame = frame.copy()
            self.prevKeyPoints = copy.copy(keypoints)
            self.prevDescriptors = copy.copy(descriptors)

            # Initialization done
            self.initializedFirstFrame = True

            return H

        # Match descriptors
        knnMatches = self.matcher.knnMatch(self.prevDescriptors, descriptors, 2)

        # Filter matches based on smallest spatial distance
        matches = []
        spatialDistances = []

        maxSpatialDistance = 0.25 * np.array([width, height])

        # Handle empty matches case
        if len(knnMatches) == 0:
            # Store to next iteration
            self.prevFrame = frame.copy()
            self.prevKeyPoints = copy.copy(keypoints)
            self.prevDescriptors = copy.copy(descriptors)

            return H

        for m, n in knnMatches:
            if m.distance < 0.9 * n.distance:
                prevKeyPointLocation = self.prevKeyPoints[m.queryIdx].pt
                currKeyPointLocation = keypoints[m.trainIdx].pt

                spatialDistance = (prevKeyPointLocation[0] - currKeyPointLocation[0],
                                   prevKeyPointLocation[1] - currKeyPointLocation[1])

                if (np.abs(spatialDistance[0]) < maxSpatialDistance[0]) and \
                        (np.abs(spatialDistance[1]) < maxSpatialDistance[1]):
                    spatialDistances.append(spatialDistance)
                    matches.append(m)

        meanSpatialDistances = np.mean(spatialDistances, 0)
        stdSpatialDistances = np.std(spatialDistances, 0)

        inliers = (spatialDistances - meanSpatialDistances) < 2.5 * stdSpatialDistances

        goodMatches = []
        prevPoints = []
        currPoints = []
        for i in range(len(matches)):
            if inliers[i, 0] and inliers[i, 1]:
                goodMatches.append(matches[i])
                prevPoints.append(self.prevKeyPoints[matches[i].queryIdx].pt)
                currPoints.append(keypoints[matches[i].trainIdx].pt)

        prevPoints = np.array(prevPoints)
        currPoints = np.array(currPoints)

        # Draw the keypoint matches on the output image (debug only)
        if 0:
            matches_img = np.hstack((self.prevFrame, frame))
            matches_img = cv2.cvtColor(matches_img, cv2.COLOR_GRAY2BGR)
            W = np.size(self.prevFrame, 1)
            for m in goodMatches:
                prev_pt = np.array(self.prevKeyPoints[m.queryIdx].pt, dtype=np.int_)
                curr_pt = np.array(keypoints[m.trainIdx].pt, dtype=np.int_)
                curr_pt[0] += W
                color = np.random.randint(0, 255, (3,))
                color = (int(color[0]), int(color[1]), int(color[2]))

                matches_img = cv2.line(matches_img, prev_pt, curr_pt, tuple(color), 1, cv2.LINE_AA)
                matches_img = cv2.circle(matches_img, prev_pt, 2, tuple(color), -1)
                matches_img = cv2.circle(matches_img, curr_pt, 2, tuple(color), -1)

            plt.figure()
            plt.imshow(matches_img)
            plt.show()

        # Find rigid matrix
        if (np.size(prevPoints, 0) > 4) and (np.size(prevPoints, 0) == np.size(currPoints, 0)):
            H, inliers = cv2.estimateAffinePartial2D(prevPoints, currPoints, method=cv2.RANSAC)

            # Handle downscale
            if self.downscale > 1.0:
                H[0, 2] *= self.downscale
                H[1, 2] *= self.downscale
        else:
            print('Warning: not enough matching points')

        # Store to next iteration
        self.prevFrame = frame.copy()
        self.prevKeyPoints = copy.copy(keypoints)
        self.prevDescriptors = copy.copy(descriptors)

        return H

    def applyFile(self, raw_frame, detections=None):
        line = self.gmcFile.readline()
        tokens = line.split("\t")
        H = np.eye(2, 3, dtype=np.float_)
        H[0, 0] = float(tokens[1])
        H[0, 1] = float(tokens[2])
        H[0, 2] = float(tokens[3])
        H[1, 0] = float(tokens[4])
        H[1, 1] = float(tokens[5])
        H[1, 2] = float(tokens[6])

        return H

    @staticmethod
    def multi_gmc(stracks, H=np.eye(2, 3)):
        """
        GMC module prediction
        :param stracks: List[Strack]
        """
        if len(stracks) > 0:
            multi_mean = np.asarray([st.kalman_filter.kf.x.copy() for st in stracks])
            multi_covariance = np.asarray([st.kalman_filter.kf.P for st in stracks])

            R = H[:2, :2]
            R8x8 = np.kron(np.eye(4, dtype=float), R)
            t = H[:2, 2]

            for i, (mean, cov) in enumerate(zip(multi_mean, multi_covariance)):
                mean = R8x8.dot(mean)
                mean[:2] += t
                cov = R8x8.dot(cov).dot(R8x8.transpose())

                stracks[i].kalman_filter.kf.x = mean
                stracks[i].kalman_filter.kf.P = cov
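multi_gmc rotates and translates every Kalman state by the 2x3 warp: the 2x2 rotation block is Kronecker-expanded to act on all four (position, size, velocity) pairs of the 8-dim state, while the translation shifts only the position entries. A tiny numeric sketch with an identity-plus-shift warp and a made-up state:

import numpy as np

H = np.array([[1., 0., 5.],    # 2x3 affine warp: pure shift by (+5, -3) pixels
              [0., 1., -3.]])
R8x8 = np.kron(np.eye(4), H[:2, :2])   # expand the 2x2 rotation to the 8-dim state
mean = np.arange(8, dtype=float)       # fake KF state [x, y, w, h, vx, vy, vw, vh]

warped = R8x8 @ mean
warped[:2] += H[:2, 2]                 # translation moves the position only
print(warped)  # [ 5. -2.  2.  3.  4.  5.  6.  7.]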
327 yolov7-tracker-example/tracker/trackers/deepsort_tracker.py Normal file
@ -0,0 +1,327 @@
"""
DeepSORT
"""

import numpy as np
import torch
from torchvision.ops import nms

import cv2
import torchvision.transforms as T

from .basetrack import BaseTrack, TrackState
from .tracklet import Tracklet, Tracklet_w_reid
from .matching import *

from .reid_models.OSNet import *
from .reid_models.load_model_tools import load_pretrained_weights
from .reid_models.deepsort_reid import Extractor

REID_MODEL_DICT = {
    'osnet_x1_0': osnet_x1_0,
    'osnet_x0_75': osnet_x0_75,
    'osnet_x0_5': osnet_x0_5,
    'osnet_x0_25': osnet_x0_25,
    'deepsort': Extractor
}


def load_reid_model(reid_model, reid_model_path):

    if 'osnet' in reid_model:
        func = REID_MODEL_DICT[reid_model]
        model = func(num_classes=1, pretrained=False)
        load_pretrained_weights(model, reid_model_path)
        model.cuda().eval()

    elif 'deepsort' in reid_model:
        model = REID_MODEL_DICT[reid_model](reid_model_path, use_cuda=True)

    else:
        raise NotImplementedError

    return model


class DeepSortTracker(object):

    def __init__(self, args, frame_rate=30):
        self.tracked_tracklets = []  # type: list[Tracklet]
        self.lost_tracklets = []  # type: list[Tracklet]
        self.removed_tracklets = []  # type: list[Tracklet]

        self.frame_id = 0
        self.args = args

        self.det_thresh = args.conf_thresh + 0.1
        self.buffer_size = int(frame_rate / 30.0 * args.track_buffer)
        self.max_time_lost = self.buffer_size

        self.motion = args.kalman_format

        self.with_reid = not args.discard_reid

        self.reid_model, self.crop_transforms = None, None
        if self.with_reid:
            self.reid_model = load_reid_model(args.reid_model, args.reid_model_path)
            self.crop_transforms = T.Compose([
                # T.ToPILImage(),
                # T.Resize(size=(256, 128)),
                T.ToTensor(),  # (c, h, w)
                T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
            ])

        self.bbox_crop_size = (64, 128) if 'deepsort' in args.reid_model else (128, 128)

    def reid_preprocess(self, obj_bbox):
        """
        preprocess cropped object bboxes

        obj_bbox: np.ndarray, shape=(h_obj, w_obj, c)

        return:
            torch.Tensor of shape (c, h, w), resized to self.bbox_crop_size
        """

        obj_bbox = cv2.resize(obj_bbox.astype(np.float32) / 255.0, dsize=self.bbox_crop_size)  # shape: (h, w, c)

        return self.crop_transforms(obj_bbox)

    def get_feature(self, tlwhs, ori_img):
        """
        get the appearance feature of an object
        tlwhs: shape (num_of_objects, 4)
        ori_img: original image, np.ndarray, shape(H, W, C)
        """
        obj_bbox = []

        for tlwh in tlwhs:
            tlwh = list(map(int, tlwh))

            # limit to the legal range
            tlwh[0], tlwh[1] = max(tlwh[0], 0), max(tlwh[1], 0)

            tlbr_tensor = self.reid_preprocess(ori_img[tlwh[1]: tlwh[1] + tlwh[3], tlwh[0]: tlwh[0] + tlwh[2]])

            obj_bbox.append(tlbr_tensor)

        if not obj_bbox:
            return np.array([])

        obj_bbox = torch.stack(obj_bbox, dim=0)
        obj_bbox = obj_bbox.cuda()

        features = self.reid_model(obj_bbox)  # shape: (num_of_objects, feature_dim)
        return features.cpu().detach().numpy()

    def update(self, output_results, img, ori_img):
        """
        output_results: processed detections (scaled to original size), tlbr format
        """

        self.frame_id += 1
        activated_tracklets = []
        refind_tracklets = []
        lost_tracklets = []
        removed_tracklets = []

        scores = output_results[:, 4]
        bboxes = output_results[:, :4]
        categories = output_results[:, -1]

        remain_inds = scores > self.args.conf_thresh

        dets = bboxes[remain_inds]

        cates = categories[remain_inds]

        scores_keep = scores[remain_inds]

        features_keep = self.get_feature(tlwhs=dets[:, :4], ori_img=ori_img)

        if len(dets) > 0:
            '''Detections'''
            detections = [Tracklet_w_reid(tlwh, s, cate, motion=self.motion, feat=feat) for
                          (tlwh, s, cate, feat) in zip(dets, scores_keep, cates, features_keep)]
        else:
            detections = []

        ''' Add newly detected tracklets to tracked_tracklets'''
        unconfirmed = []
        tracked_tracklets = []  # type: list[Tracklet]
        for track in self.tracked_tracklets:
            if not track.is_activated:
                unconfirmed.append(track)
            else:
                tracked_tracklets.append(track)

        ''' Step 2: First association, with appearance'''
        tracklet_pool = joint_tracklets(tracked_tracklets, self.lost_tracklets)

        # Predict the current location with Kalman
        for tracklet in tracklet_pool:
            tracklet.predict()

        matches, u_track, u_detection = matching_cascade(distance_metric=self.gated_metric,
                                                         matching_thresh=0.9,
                                                         cascade_depth=30,
                                                         tracks=tracklet_pool,
                                                         detections=detections
                                                         )

        for itracked, idet in matches:
            track = tracklet_pool[itracked]
            det = detections[idet]
            if track.state == TrackState.Tracked:
                track.update(detections[idet], self.frame_id)
                activated_tracklets.append(track)
            else:
                track.re_activate(det, self.frame_id, new_id=False)
                refind_tracklets.append(track)

        '''Step 3: Second association, with iou'''
        tracklet_for_iou = [tracklet_pool[i] for i in u_track if tracklet_pool[i].state == TrackState.Tracked]
        detection_for_iou = [detections[i] for i in u_detection]

        dists = iou_distance(tracklet_for_iou, detection_for_iou)

        matches, u_track, u_detection = linear_assignment(dists, thresh=0.5)

        for itracked, idet in matches:
            track = tracklet_for_iou[itracked]
            det = detection_for_iou[idet]
            if track.state == TrackState.Tracked:
                track.update(detection_for_iou[idet], self.frame_id)
                activated_tracklets.append(track)
            else:
                track.re_activate(det, self.frame_id, new_id=False)
                refind_tracklets.append(track)

        for it in u_track:
            track = tracklet_for_iou[it]
            if not track.state == TrackState.Lost:
                track.mark_lost()
                lost_tracklets.append(track)

        '''Deal with unconfirmed tracks, usually tracks with only one beginning frame'''
        detections = [detection_for_iou[i] for i in u_detection]
        dists = iou_distance(unconfirmed, detections)

        matches, u_unconfirmed, u_detection = linear_assignment(dists, thresh=0.7)

        for itracked, idet in matches:
            unconfirmed[itracked].update(detections[idet], self.frame_id)
            activated_tracklets.append(unconfirmed[itracked])
        for it in u_unconfirmed:
            track = unconfirmed[it]
            track.mark_removed()
            removed_tracklets.append(track)

        """ Step 4: Init new tracklets"""
        for inew in u_detection:
            track = detections[inew]
            if track.score < self.det_thresh:
                continue
            track.activate(self.frame_id)
            activated_tracklets.append(track)

        """ Step 5: Update state"""
        for track in self.lost_tracklets:
            if self.frame_id - track.end_frame > self.max_time_lost:
                track.mark_removed()
                removed_tracklets.append(track)

        # print('Remained match {} s'.format(t4-t3))

        self.tracked_tracklets = [t for t in self.tracked_tracklets if t.state == TrackState.Tracked]
        self.tracked_tracklets = joint_tracklets(self.tracked_tracklets, activated_tracklets)
        self.tracked_tracklets = joint_tracklets(self.tracked_tracklets, refind_tracklets)
        self.lost_tracklets = sub_tracklets(self.lost_tracklets, self.tracked_tracklets)
        self.lost_tracklets.extend(lost_tracklets)
        self.lost_tracklets = sub_tracklets(self.lost_tracklets, self.removed_tracklets)
        self.removed_tracklets.extend(removed_tracklets)
        self.tracked_tracklets, self.lost_tracklets = remove_duplicate_tracklets(self.tracked_tracklets, self.lost_tracklets)
        # get scores of lost tracks
        output_tracklets = [track for track in self.tracked_tracklets if track.is_activated]

        return output_tracklets

    def gated_metric(self, tracks, dets):
        """
        get the cost matrix: first calculate the appearance cost, then gate it by the Kalman state.

        tracks: List[STrack]
        dets: List[STrack]
        """
        appearance_dist = nearest_embedding_distance(tracks=tracks, detections=dets, metric='cosine')
        cost_matrix = self.gate_cost_matrix(appearance_dist, tracks, dets)
        return cost_matrix

    def gate_cost_matrix(self, cost_matrix, tracks, dets, max_apperance_thresh=0.15, gated_cost=1e5, only_position=False):
        """
        gate the cost matrix by the Kalman state distance, constrained by the
        0.95 confidence interval of the chi-square distribution

        cost_matrix: np.ndarray, shape (len(tracks), len(dets))
        tracks: List[STrack]
        dets: List[STrack]
        gated_cost: a very large constant used to mark infeasible associations
        only_position: use [xc, yc, a, h] as the state vector, or only [xc, yc]

        return:
            updated cost_matrix, np.ndarray
        """
        gating_dim = 2 if only_position else 4
        gating_threshold = chi2inv95[gating_dim]
        measurements = np.asarray([Tracklet.tlwh_to_xyah(det.tlwh) for det in dets])  # (len(dets), 4)

        cost_matrix[cost_matrix > max_apperance_thresh] = gated_cost
        for row, track in enumerate(tracks):
            gating_distance = track.kalman_filter.gating_distance(measurements)
            cost_matrix[row, gating_distance > gating_threshold] = gated_cost
        return cost_matrix


def joint_tracklets(tlista, tlistb):
    exists = {}
    res = []
    for t in tlista:
        exists[t.track_id] = 1
        res.append(t)
    for t in tlistb:
        tid = t.track_id
        if not exists.get(tid, 0):
            exists[tid] = 1
            res.append(t)
    return res


def sub_tracklets(tlista, tlistb):
    tracklets = {}
    for t in tlista:
        tracklets[t.track_id] = t
    for t in tlistb:
        tid = t.track_id
        if tracklets.get(tid, 0):
            del tracklets[tid]
    return list(tracklets.values())


def remove_duplicate_tracklets(trackletsa, trackletsb):
    pdist = iou_distance(trackletsa, trackletsb)
    pairs = np.where(pdist < 0.15)
    dupa, dupb = list(), list()
    for p, q in zip(*pairs):
        timep = trackletsa[p].frame_id - trackletsa[p].start_frame
        timeq = trackletsb[q].frame_id - trackletsb[q].start_frame
        if timep > timeq:
            dupb.append(q)
        else:
            dupa.append(p)
    resa = [t for i, t in enumerate(trackletsa) if i not in dupa]
    resb = [t for i, t in enumerate(trackletsb) if i not in dupb]
    return resa, resb
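The gate uses the 0.95 quantile of the chi-square distribution (9.4877 for 4 degrees of freedom): any track/detection pair whose squared Mahalanobis distance exceeds it is priced out, no matter how similar it looks. A tiny numeric sketch with invented costs:

import numpy as np

chi2inv95_4dof = 9.4877          # 0.95 quantile of chi-square with 4 dof
gated_cost = 1e5

cost = np.array([[0.10, 0.12]])           # appearance cost: track 0 vs detections 0, 1
gating_distance = np.array([3.2, 15.0])   # squared Mahalanobis distance of each detection

cost[0, gating_distance > chi2inv95_4dof] = gated_cost
print(cost)  # [[1.0e-01 1.0e+05]] -> detection 1 is infeasible despite similar appearance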
@ -0,0 +1,74 @@
from filterpy.kalman import KalmanFilter
import numpy as np
import scipy.linalg

class BaseKalman:

    def __init__(self,
                 state_dim: int = 8,
                 observation_dim: int = 4,
                 F: np.ndarray = np.zeros((0, )),
                 P: np.ndarray = np.zeros((0, )),
                 Q: np.ndarray = np.zeros((0, )),
                 H: np.ndarray = np.zeros((0, )),
                 R: np.ndarray = np.zeros((0, )),
                 ) -> None:

        self.kf = KalmanFilter(dim_x=state_dim, dim_z=observation_dim, dim_u=0)
        if F.shape[0] > 0: self.kf.F = F  # if valid
        if P.shape[0] > 0: self.kf.P = P
        if Q.shape[0] > 0: self.kf.Q = Q
        if H.shape[0] > 0: self.kf.H = H
        if R.shape[0] > 0: self.kf.R = R

    def initialize(self, observation):
        raise NotImplementedError

    def predict(self, ):
        self.kf.predict()

    def update(self, observation, **kwargs):
        self.kf.update(observation)

    def get_state(self, ):
        return self.kf.x

    def gating_distance(self, measurements, only_position=False):
        """Compute gating distance between state distribution and measurements.
        A suitable distance threshold can be obtained from `chi2inv95`. If
        `only_position` is False, the chi-square distribution has 4 degrees of
        freedom, otherwise 2.
        Parameters
        ----------
        measurements : ndarray
            An Nx4 dimensional matrix of N measurements; note the format (whether xywh or xyah or others)
            should be identical to the state definition
        only_position : Optional[bool]
            If True, distance computation is done with respect to the bounding
            box center position only.
        Returns
        -------
        ndarray
            Returns an array of length N, where the i-th element contains the
            squared Mahalanobis distance between (mean, covariance) and
            `measurements[i]`.
        """

        # map state space to measurement space
        mean = self.kf.x.copy()
        mean = np.dot(self.kf.H, mean)
        covariance = np.linalg.multi_dot((self.kf.H, self.kf.P, self.kf.H.T))

        if only_position:
            mean, covariance = mean[:2], covariance[:2, :2]
            measurements = measurements[:, :2]

        cholesky_factor = np.linalg.cholesky(covariance)
        d = measurements - mean
        z = scipy.linalg.solve_triangular(
            cholesky_factor, d.T, lower=True, check_finite=False,
            overwrite_b=True)
        squared_maha = np.sum(z * z, axis=0)
        return squared_maha
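BaseKalman is a thin wrapper over filterpy: subclasses supply F/H and the noise schedules, and reuse predict/update. A minimal toy subclass showing the same contract for a 1-D constant-velocity state; this is an illustration, not part of the repo:

import numpy as np
from filterpy.kalman import KalmanFilter

class ToyKalman:
    # mirrors the BaseKalman contract: wrap a filterpy KalmanFilter
    def __init__(self):
        self.kf = KalmanFilter(dim_x=2, dim_z=1)
        self.kf.F = np.array([[1., 1.], [0., 1.]])  # pos += vel each step
        self.kf.H = np.array([[1., 0.]])            # observe position only

    def initialize(self, z):
        self.kf.x = np.array([z, 0.])

    def predict(self):
        self.kf.predict()

    def update(self, z):
        self.kf.update(z)

tk = ToyKalman()
tk.initialize(0.0)
for z in [1.0, 2.0, 3.0]:
    tk.predict(); tk.update(z)
print(tk.kf.x[0])  # estimate drifts toward the measurements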
@ -0,0 +1,99 @@
from .base_kalman import BaseKalman
import numpy as np
import cv2

class BotKalman(BaseKalman):

    def __init__(self, ):

        state_dim = 8  # [x, y, w, h, vx, vy, vw, vh]
        observation_dim = 4

        F = np.eye(state_dim, state_dim)
        '''
        [1, 0, 0, 0, 1, 0, 0, 0]
        [0, 1, 0, 0, 0, 1, 0, 0]
        ...
        '''
        for i in range(state_dim // 2):
            F[i, i + state_dim // 2] = 1

        H = np.eye(state_dim // 2, state_dim)

        super().__init__(state_dim=state_dim,
                         observation_dim=observation_dim,
                         F=F,
                         H=H)

        self._std_weight_position = 1. / 20
        self._std_weight_velocity = 1. / 160

    def initialize(self, observation):
        """ init x, P, Q, R

        Args:
            observation: x-y-w-h format
        """
        # init x, P, Q, R

        mean_pos = observation
        mean_vel = np.zeros_like(observation)
        self.kf.x = np.r_[mean_pos, mean_vel]  # x_{0, 0}

        std = [
            2 * self._std_weight_position * observation[2],  # related to w
            2 * self._std_weight_position * observation[3],  # related to h
            2 * self._std_weight_position * observation[2],
            2 * self._std_weight_position * observation[3],
            10 * self._std_weight_velocity * observation[2],
            10 * self._std_weight_velocity * observation[3],
            10 * self._std_weight_velocity * observation[2],
            10 * self._std_weight_velocity * observation[3],
        ]

        self.kf.P = np.diag(np.square(std))  # P_{0, 0}

    def predict(self, ):
        """ predict step

        x_{n + 1, n} = F * x_{n, n}
        P_{n + 1, n} = F * P_{n, n} * F^T + Q

        """
        std_pos = [
            self._std_weight_position * self.kf.x[2],
            self._std_weight_position * self.kf.x[3],
            self._std_weight_position * self.kf.x[2],
            self._std_weight_position * self.kf.x[3]]
        std_vel = [
            self._std_weight_velocity * self.kf.x[2],
            self._std_weight_velocity * self.kf.x[3],
            self._std_weight_velocity * self.kf.x[2],
            self._std_weight_velocity * self.kf.x[3]]

        Q = np.diag(np.square(np.r_[std_pos, std_vel]))

        self.kf.predict(Q=Q)

    def update(self, z):
        """ update step

        Args:
            z: observation x-y-w-h format

        K_n = P_{n, n - 1} * H^T * (H P_{n, n - 1} H^T + R)^{-1}
        x_{n, n} = x_{n, n - 1} + K_n * (z - H * x_{n, n - 1})
        P_{n, n} = (I - K_n * H) P_{n, n - 1} (I - K_n * H)^T + K_n R_n K_n^T

        """

        std = [
            self._std_weight_position * self.kf.x[2],
            self._std_weight_position * self.kf.x[3],
            self._std_weight_position * self.kf.x[2],
            self._std_weight_position * self.kf.x[3]]

        R = np.diag(np.square(std))

        self.kf.update(z=z, R=R)
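The F built in the loop above is the plain constant-velocity model: each predict step adds the velocity half of the state to the position half. A quick check with an invented state:

import numpy as np

state_dim = 8
F = np.eye(state_dim)
for i in range(state_dim // 2):
    F[i, i + state_dim // 2] = 1          # constant velocity: pos += vel each step

x = np.array([100., 50., 40., 80., 2., -1., 0., 0.])  # made-up [x, y, w, h, vx, vy, vw, vh]
print(F @ x)  # [102.  49.  40.  80.   2.  -1.   0.   0.]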
@ -0,0 +1,97 @@
from .base_kalman import BaseKalman
import numpy as np

class ByteKalman(BaseKalman):

    def __init__(self, ):

        state_dim = 8  # [x, y, a, h, vx, vy, va, vh]
        observation_dim = 4

        F = np.eye(state_dim, state_dim)
        '''
        [1, 0, 0, 0, 1, 0, 0, 0]
        [0, 1, 0, 0, 0, 1, 0, 0]
        ...
        '''
        for i in range(state_dim // 2):
            F[i, i + state_dim // 2] = 1

        H = np.eye(state_dim // 2, state_dim)

        super().__init__(state_dim=state_dim,
                         observation_dim=observation_dim,
                         F=F,
                         H=H)

        self._std_weight_position = 1. / 20
        self._std_weight_velocity = 1. / 160

    def initialize(self, observation):
        """ init x, P, Q, R

        Args:
            observation: x-y-a-h format
        """
        # init x, P, Q, R

        mean_pos = observation
        mean_vel = np.zeros_like(observation)
        self.kf.x = np.r_[mean_pos, mean_vel]  # x_{0, 0}

        std = [
            2 * self._std_weight_position * observation[3],  # related to h
            2 * self._std_weight_position * observation[3],
            1e-2,
            2 * self._std_weight_position * observation[3],
            10 * self._std_weight_velocity * observation[3],
            10 * self._std_weight_velocity * observation[3],
            1e-5,
            10 * self._std_weight_velocity * observation[3],
        ]

        self.kf.P = np.diag(np.square(std))  # P_{0, 0}

    def predict(self, ):
        """ predict step

        x_{n + 1, n} = F * x_{n, n}
        P_{n + 1, n} = F * P_{n, n} * F^T + Q

        """
        std_pos = [
            self._std_weight_position * self.kf.x[3],
            self._std_weight_position * self.kf.x[3],
            1e-2,
            self._std_weight_position * self.kf.x[3]]
        std_vel = [
            self._std_weight_velocity * self.kf.x[3],
            self._std_weight_velocity * self.kf.x[3],
            1e-5,
            self._std_weight_velocity * self.kf.x[3]]

        Q = np.diag(np.square(np.r_[std_pos, std_vel]))

        self.kf.predict(Q=Q)

    def update(self, z):
        """ update step

        Args:
            z: observation x-y-a-h format

        K_n = P_{n, n - 1} * H^T * (H P_{n, n - 1} H^T + R)^{-1}
        x_{n, n} = x_{n, n - 1} + K_n * (z - H * x_{n, n - 1})
        P_{n, n} = (I - K_n * H) P_{n, n - 1} (I - K_n * H)^T + K_n R_n K_n^T

        """

        std = [
            self._std_weight_position * self.kf.x[3],
            self._std_weight_position * self.kf.x[3],
            1e-1,
            self._std_weight_position * self.kf.x[3]]

        R = np.diag(np.square(std))

        self.kf.update(z=z, R=R)
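Note how ByteKalman scales its process noise with the current box height, while the aspect-ratio entries get fixed, tiny standard deviations (1e-2 / 1e-5): the ratio is assumed nearly constant. A sketch of the Q schedule for a made-up height:

import numpy as np

h = 80.0                                   # current box height drives the noise scale
std_pos = [h / 20, h / 20, 1e-2, h / 20]   # x, y, a, h position noise
std_vel = [h / 160, h / 160, 1e-5, h / 160]
Q = np.diag(np.square(np.r_[std_pos, std_vel]))
print(np.sqrt(np.diag(Q))[:4])  # [4.   4.   0.01 4.  ]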
@ -0,0 +1,144 @@
|
||||
from numpy.core.multiarray import zeros as zeros
|
||||
from .base_kalman import BaseKalman
|
||||
import numpy as np
|
||||
from copy import deepcopy
|
||||
|
||||
class OCSORTKalman(BaseKalman):
|
||||
|
||||
def __init__(self, ):
|
||||
|
||||
state_dim = 7 # [x, y, s, a, vx, vy, vs] s: area
|
||||
observation_dim = 4
|
||||
|
||||
F = np.array([[1, 0, 0, 0, 1, 0, 0],
|
||||
[0, 1, 0, 0, 0, 1, 0],
|
||||
[0, 0, 1, 0, 0, 0, 1],
|
||||
[0, 0, 0, 1, 0, 0, 0],
|
||||
[0, 0, 0, 0, 1, 0, 0],
|
||||
[0, 0, 0, 0, 0, 1, 0],
|
||||
[0, 0, 0, 0, 0, 0, 1]])
|
||||
|
||||
H = np.eye(state_dim // 2 + 1, state_dim)
|
||||
|
||||
super().__init__(state_dim=state_dim,
|
||||
observation_dim=observation_dim,
|
||||
F=F,
|
||||
H=H)
|
||||
|
||||
# TODO check
|
||||
# give high uncertainty to the unobservable initial velocities
|
||||
self.kf.R[2:, 2:] *= 10 # [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 10, 0], [0, 0, 0, 10]]
|
||||
self.kf.P[4:, 4:] *= 1000
|
||||
self.kf.P *= 10
|
||||
self.kf.Q[-1, -1] *= 0.01
|
||||
self.kf.Q[4:, 4:] *= 0.01
|
||||
|
||||
# keep all observations
|
||||
self.history_obs = []
|
||||
self.attr_saved = None
|
||||
self.observed = False
|
||||
|
||||
def initialize(self, observation):
|
||||
"""
|
||||
Args:
|
||||
observation: x-y-s-a
|
||||
"""
|
||||
self.kf.x = self.kf.x.flatten()
|
||||
self.kf.x[:4] = observation
|
||||
|
||||
|
||||
def predict(self, ):
|
||||
""" predict step
|
||||
|
||||
"""
|
||||
|
||||
# s + vs
|
||||
if self.kf.x[6] + self.kf.x[2] <= 0:
|
||||
self.kf.x[6] *= 0.0
|
||||
|
||||
self.kf.predict()
|
||||
|
||||
def _freeze(self, ):
|
||||
""" freeze all the param of Kalman
|
||||
|
||||
"""
|
||||
self.attr_saved = deepcopy(self.kf.__dict__)
|
||||
|
||||
def _unfreeze(self, ):
|
||||
""" when observe an lost object again, use the virtual trajectory
|
||||
|
||||
"""
|
||||
if self.attr_saved is not None:
|
||||
new_history = deepcopy(self.history_obs)
|
||||
self.kf.__dict__ = self.attr_saved
|
||||
|
||||
self.history_obs = self.history_obs[:-1]
|
||||
|
||||
occur = [int(d is None) for d in new_history]
|
||||
indices = np.where(np.array(occur)==0)[0]
|
||||
index1 = indices[-2]
|
||||
index2 = indices[-1]
|
||||
box1 = new_history[index1]
|
||||
x1, y1, s1, r1 = box1
|
||||
w1 = np.sqrt(s1 * r1)
|
||||
h1 = np.sqrt(s1 / r1)
|
||||
box2 = new_history[index2]
|
||||
x2, y2, s2, r2 = box2
|
||||
w2 = np.sqrt(s2 * r2)
|
||||
h2 = np.sqrt(s2 / r2)
|
||||
time_gap = index2 - index1
|
||||
dx = (x2-x1)/time_gap
|
||||
dy = (y2-y1)/time_gap
|
||||
dw = (w2-w1)/time_gap
|
||||
dh = (h2-h1)/time_gap
|
||||
for i in range(index2 - index1):
|
||||
"""
|
||||
The default virtual trajectory generation is by linear
|
||||
motion (constant speed hypothesis), you could modify this
|
||||
part to implement your own.
|
||||
"""
|
||||
x = x1 + (i+1) * dx
|
||||
y = y1 + (i+1) * dy
|
||||
w = w1 + (i+1) * dw
|
||||
h = h1 + (i+1) * dh
|
||||
s = w * h
|
||||
r = w / float(h)
|
||||
new_box = np.array([x, y, s, r]).reshape((4, 1))
|
||||
"""
|
||||
I still use predict-update loop here to refresh the parameters,
|
||||
but this can be faster by directly modifying the internal parameters
|
||||
as suggested in the paper. I keep this naive but slow way for
|
||||
easy read and understanding
|
||||
"""
|
||||
self.kf.update(new_box)
|
||||
if not i == (index2-index1-1):
|
||||
self.kf.predict()
|
||||
|
||||
|
||||
def update(self, z):
|
||||
""" update step
|
||||
|
||||
For simplicity, modify self.kf directly, since OC-SORT changes the internal Kalman state
|
||||
|
||||
Args:
|
||||
z: observation x-y-s-a format
|
||||
"""
|
||||
|
||||
self.history_obs.append(z)
|
||||
|
||||
if z is None:
|
||||
if self.observed:
|
||||
self._freeze()
|
||||
self.observed = False
|
||||
|
||||
self.kf.update(z)
|
||||
|
||||
else:
|
||||
if not self.observed: # Get observation, use online smoothing to re-update parameters
|
||||
self._unfreeze()
|
||||
|
||||
self.kf.update(z)
|
||||
|
||||
self.observed = True
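The linear virtual trajectory built in _unfreeze can be exercised on its own; a minimal sketch with x-y-s-r boxes under the same constant-speed assumption:

import numpy as np

def virtual_boxes(box1, box2, gap):
    # linearly interpolate two x-y-s-r observations over `gap` frames
    x1, y1, s1, r1 = box1
    x2, y2, s2, r2 = box2
    w1, h1 = np.sqrt(s1 * r1), np.sqrt(s1 / r1)
    w2, h2 = np.sqrt(s2 * r2), np.sqrt(s2 / r2)
    out = []
    for i in range(1, gap + 1):
        t = i / gap
        w, h = w1 + t * (w2 - w1), h1 + t * (h2 - h1)
        out.append([x1 + t * (x2 - x1), y1 + t * (y2 - y1), w * h, w / h])
    return np.array(out)

print(virtual_boxes([10, 10, 100, 1.0], [20, 20, 144, 1.0], gap=4))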
|
||||
|
||||
|
@ -0,0 +1,73 @@
|
||||
from .base_kalman import BaseKalman
|
||||
import numpy as np
|
||||
from copy import deepcopy
|
||||
|
||||
class SORTKalman(BaseKalman):
|
||||
|
||||
def __init__(self, ):
|
||||
|
||||
state_dim = 7 # [x, y, s, a, vx, vy, vs] s: area
|
||||
observation_dim = 4
|
||||
|
||||
F = np.array([[1, 0, 0, 0, 1, 0, 0],
|
||||
[0, 1, 0, 0, 0, 1, 0],
|
||||
[0, 0, 1, 0, 0, 0, 1],
|
||||
[0, 0, 0, 1, 0, 0, 0],
|
||||
[0, 0, 0, 0, 1, 0, 0],
|
||||
[0, 0, 0, 0, 0, 1, 0],
|
||||
[0, 0, 0, 0, 0, 0, 1]])
|
||||
|
||||
H = np.eye(state_dim // 2 + 1, state_dim)
|
||||
|
||||
super().__init__(state_dim=state_dim,
|
||||
observation_dim=observation_dim,
|
||||
F=F,
|
||||
H=H)
|
||||
|
||||
# TODO check
|
||||
# give high uncertainty to the unobservable initial velocities
|
||||
self.kf.R[2:, 2:] *= 10 # [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 10, 0], [0, 0, 0, 10]]
|
||||
self.kf.P[4:, 4:] *= 1000
|
||||
self.kf.P *= 10
|
||||
self.kf.Q[-1, -1] *= 0.01
|
||||
self.kf.Q[4:, 4:] *= 0.01
|
||||
|
||||
# keep all observations
|
||||
self.history_obs = []
|
||||
self.attr_saved = None
|
||||
self.observed = False
|
||||
|
||||
def initialize(self, observation):
|
||||
"""
|
||||
Args:
|
||||
observation: x-y-s-a
|
||||
"""
|
||||
self.kf.x = self.kf.x.flatten()
|
||||
self.kf.x[:4] = observation
|
||||
|
||||
|
||||
def predict(self, ):
|
||||
""" predict step
|
||||
|
||||
"""
|
||||
|
||||
# s + vs
|
||||
if self.kf.x[6] + self.kf.x[2] <= 0:
|
||||
self.kf.x[6] *= 0.0
|
||||
|
||||
self.kf.predict()
|
||||
|
||||
def update(self, z):
|
||||
""" update step
|
||||
|
||||
For simplicity, modify self.kf directly
|
||||
|
||||
Args:
|
||||
z: observation x-y-s-a format
|
||||
"""
|
||||
|
||||
self.kf.update(z)
|
||||
|
||||
|
||||
|
@ -0,0 +1,101 @@
|
||||
from .base_kalman import BaseKalman
|
||||
import numpy as np
|
||||
|
||||
class NSAKalman(BaseKalman):
|
||||
|
||||
def __init__(self, ):
|
||||
|
||||
state_dim = 8 # [x, y, a, h, vx, vy, va, vh]
|
||||
observation_dim = 4
|
||||
|
||||
F = np.eye(state_dim, state_dim)
|
||||
'''
|
||||
[1, 0, 0, 0, 1, 0, 0, 0]
|
||||
[0, 1, 0, 0, 0, 1, 0, 0]
|
||||
...
|
||||
'''
|
||||
for i in range(state_dim // 2):
|
||||
F[i, i + state_dim // 2] = 1
|
||||
|
||||
H = np.eye(state_dim // 2, state_dim)
|
||||
|
||||
super().__init__(state_dim=state_dim,
|
||||
observation_dim=observation_dim,
|
||||
F=F,
|
||||
H=H)
|
||||
|
||||
self._std_weight_position = 1. / 20
|
||||
self._std_weight_velocity = 1. / 160
|
||||
|
||||
def initialize(self, observation):
|
||||
""" init x, P, Q, R
|
||||
|
||||
Args:
|
||||
observation: x-y-a-h format
|
||||
"""
|
||||
# init x, P, Q, R
|
||||
|
||||
mean_pos = observation
|
||||
mean_vel = np.zeros_like(observation)
|
||||
self.kf.x = np.r_[mean_pos, mean_vel] # x_{0, 0}
|
||||
|
||||
std = [
|
||||
2 * self._std_weight_position * observation[3], # related to h
|
||||
2 * self._std_weight_position * observation[3],
|
||||
1e-2,
|
||||
2 * self._std_weight_position * observation[3],
|
||||
10 * self._std_weight_velocity * observation[3],
|
||||
10 * self._std_weight_velocity * observation[3],
|
||||
1e-5,
|
||||
10 * self._std_weight_velocity * observation[3],
|
||||
]
|
||||
|
||||
self.kf.P = np.diag(np.square(std)) # P_{0, 0}
|
||||
|
||||
def predict(self, ):
|
||||
""" predict step
|
||||
|
||||
x_{n + 1, n} = F * x_{n, n}
|
||||
P_{n + 1, n} = F * P_{n, n} * F^T + Q
|
||||
|
||||
"""
|
||||
std_pos = [
|
||||
self._std_weight_position * self.kf.x[3],
|
||||
self._std_weight_position * self.kf.x[3],
|
||||
1e-2,
|
||||
self._std_weight_position * self.kf.x[3]]
|
||||
std_vel = [
|
||||
self._std_weight_velocity * self.kf.x[3],
|
||||
self._std_weight_velocity * self.kf.x[3],
|
||||
1e-5,
|
||||
self._std_weight_velocity * self.kf.x[3]]
|
||||
|
||||
Q = np.diag(np.square(np.r_[std_pos, std_vel]))
|
||||
|
||||
self.kf.predict(Q=Q)
|
||||
|
||||
def update(self, z, score):
|
||||
""" update step
|
||||
|
||||
Args:
|
||||
z: observation x-y-a-h format
|
||||
score: the detection score/confidence required by NSA kalman
|
||||
|
||||
K_n = P_{n, n - 1} * H^T * (H P_{n, n - 1} H^T + R)^{-1}
|
||||
x_{n, n} = x_{n, n - 1} + K_n * (z - H * x_{n, n - 1})
|
||||
P_{n, n} = (I - K_n * H) P_{n, n - 1} (I - K_n * H)^T + K_n R_n K_n^T
|
||||
|
||||
"""
|
||||
|
||||
std = [
|
||||
self._std_weight_position * self.kf.x[3],
|
||||
self._std_weight_position * self.kf.x[3],
|
||||
1e-1,
|
||||
self._std_weight_position * self.kf.x[3]]
|
||||
|
||||
# NSA
|
||||
std = [(1. - score) * x for x in std]
|
||||
|
||||
R = np.diag(np.square(std))
|
||||
|
||||
self.kf.update(z=z, R=R)
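A quick numeric view of the NSA scaling above: the measurement noise shrinks as the detection confidence grows, so confident detections pull the state harder (the stds here are hypothetical):

import numpy as np

base_std = np.array([5.0, 5.0, 0.1, 5.0])       # hypothetical x-y-a-h stds
for score in (0.3, 0.6, 0.9):
    R = np.diag(np.square((1.0 - score) * base_std))
    print(score, np.round(np.diag(R), 3))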
|
@ -0,0 +1,27 @@
|
||||
from .base_kalman import BaseKalman
|
||||
import numpy as np
|
||||
|
||||
class UCMCKalman(BaseKalman):
|
||||
def __init__(self, ):
|
||||
|
||||
state_dim = 8
|
||||
observation_dim = 4
|
||||
|
||||
F = np.eye(state_dim, state_dim)
|
||||
'''
|
||||
[1, 0, 0, 0, 1, 0, 0, 0]
|
||||
[0, 1, 0, 0, 0, 1, 0, 0]
|
||||
...
|
||||
'''
|
||||
for i in range(state_dim // 2):
|
||||
F[i, i + state_dim // 2] = 1
|
||||
|
||||
H = np.eye(state_dim // 2, state_dim)
|
||||
|
||||
super().__init__(state_dim=state_dim,
|
||||
observation_dim=observation_dim,
|
||||
F=F,
|
||||
H=H)
|
||||
|
||||
self._std_weight_position = 1. / 20
|
||||
self._std_weight_velocity = 1. / 160
|
388
yolov7-tracker-example/tracker/trackers/matching.py
Normal file
@ -0,0 +1,388 @@
|
||||
import cv2
|
||||
import numpy as np
|
||||
import scipy
|
||||
import lap
|
||||
from scipy.spatial.distance import cdist
|
||||
import math
|
||||
from cython_bbox import bbox_overlaps as bbox_ious
|
||||
import time
|
||||
|
||||
chi2inv95 = {
|
||||
1: 3.8415,
|
||||
2: 5.9915,
|
||||
3: 7.8147,
|
||||
4: 9.4877,
|
||||
5: 11.070,
|
||||
6: 12.592,
|
||||
7: 14.067,
|
||||
8: 15.507,
|
||||
9: 16.919}
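# chi2inv95 holds the 0.95 quantiles of the chi-square distribution, used to
# gate Mahalanobis distances in fuse_motion below. A quick self-check sketch
# (scipy is already imported by this module):
def _check_chi2inv95():
    from scipy.stats import chi2
    for dof, val in chi2inv95.items():
        assert abs(float(chi2.ppf(0.95, dof)) - val) < 1e-3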
|
||||
|
||||
|
||||
def merge_matches(m1, m2, shape):
|
||||
O, P, Q = shape
|
||||
m1 = np.asarray(m1)
|
||||
m2 = np.asarray(m2)
|
||||
|
||||
M1 = scipy.sparse.coo_matrix((np.ones(len(m1)), (m1[:, 0], m1[:, 1])), shape=(O, P))
|
||||
M2 = scipy.sparse.coo_matrix((np.ones(len(m2)), (m2[:, 0], m2[:, 1])), shape=(P, Q))
|
||||
|
||||
mask = M1*M2
|
||||
match = mask.nonzero()
|
||||
match = list(zip(match[0], match[1]))
|
||||
unmatched_O = tuple(set(range(O)) - set([i for i, j in match]))
|
||||
unmatched_Q = tuple(set(range(Q)) - set([j for i, j in match]))
|
||||
|
||||
return match, unmatched_O, unmatched_Q
|
||||
|
||||
|
||||
def _indices_to_matches(cost_matrix, indices, thresh):
|
||||
matched_cost = cost_matrix[tuple(zip(*indices))]
|
||||
matched_mask = (matched_cost <= thresh)
|
||||
|
||||
matches = indices[matched_mask]
|
||||
unmatched_a = tuple(set(range(cost_matrix.shape[0])) - set(matches[:, 0]))
|
||||
unmatched_b = tuple(set(range(cost_matrix.shape[1])) - set(matches[:, 1]))
|
||||
|
||||
return matches, unmatched_a, unmatched_b
|
||||
|
||||
|
||||
def linear_assignment(cost_matrix, thresh):
|
||||
if cost_matrix.size == 0:
|
||||
return np.empty((0, 2), dtype=int), tuple(range(cost_matrix.shape[0])), tuple(range(cost_matrix.shape[1]))
|
||||
matches, unmatched_a, unmatched_b = [], [], []
|
||||
cost, x, y = lap.lapjv(cost_matrix, extend_cost=True, cost_limit=thresh)
|
||||
for ix, mx in enumerate(x):
|
||||
if mx >= 0:
|
||||
matches.append([ix, mx])
|
||||
unmatched_a = np.where(x < 0)[0]
|
||||
unmatched_b = np.where(y < 0)[0]
|
||||
matches = np.asarray(matches)
|
||||
return matches, unmatched_a, unmatched_b
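# Usage sketch for the lap.lapjv backend above: with a toy cost matrix and
# thresh=0.7 the third row exceeds the cost limit and stays unassigned
# (illustrative values only):
def _linear_assignment_demo():
    cost = np.array([[0.1, 0.9],
                     [0.8, 0.2],
                     [0.5, 0.6]])
    matches, u_rows, u_cols = linear_assignment(cost, thresh=0.7)
    print(matches)   # [[0 0], [1 1]]
    print(u_rows)    # [2]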
|
||||
|
||||
|
||||
def ious(atlbrs, btlbrs):
|
||||
"""
|
||||
Compute cost based on IoU
|
||||
:type atlbrs: list[tlbr] | np.ndarray
|
||||
:type btlbrs: list[tlbr] | np.ndarray
|
||||
|
||||
:rtype ious np.ndarray
|
||||
"""
|
||||
ious = np.zeros((len(atlbrs), len(btlbrs)), dtype=np.float64)
|
||||
if ious.size == 0:
|
||||
return ious
|
||||
|
||||
ious = bbox_ious(
|
||||
np.ascontiguousarray(atlbrs, dtype=np.float64),
|
||||
np.ascontiguousarray(btlbrs, dtype=np.float64)
|
||||
)
|
||||
|
||||
return ious
|
||||
|
||||
|
||||
def iou_distance(atracks, btracks):
|
||||
"""
|
||||
Compute cost based on IoU
|
||||
:type atracks: list[STrack]
|
||||
:type btracks: list[STrack]
|
||||
|
||||
:rtype cost_matrix np.ndarray
|
||||
"""
|
||||
|
||||
if (len(atracks)>0 and isinstance(atracks[0], np.ndarray)) or (len(btracks) > 0 and isinstance(btracks[0], np.ndarray)):
|
||||
atlbrs = atracks
|
||||
btlbrs = btracks
|
||||
else:
|
||||
atlbrs = [track.tlbr for track in atracks]
|
||||
btlbrs = [track.tlbr for track in btracks]
|
||||
_ious = ious(atlbrs, btlbrs)
|
||||
cost_matrix = 1 - _ious
|
||||
|
||||
return cost_matrix
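# For intuition, the continuous IoU of [0,0,10,10] vs [5,5,15,15] worked by
# hand (note cython_bbox uses a +1 pixel convention in the areas, so its
# result differs slightly from this approximation):
def _iou_demo():
    inter = 5.0 * 5.0                   # 5x5 overlap region
    union = 100.0 + 100.0 - inter
    print(1.0 - inter / union)          # IoU distance ~0.857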
|
||||
|
||||
def v_iou_distance(atracks, btracks):
|
||||
"""
|
||||
Compute cost based on IoU
|
||||
:type atracks: list[STrack]
|
||||
:type btracks: list[STrack]
|
||||
|
||||
:rtype cost_matrix np.ndarray
|
||||
"""
|
||||
|
||||
if (len(atracks)>0 and isinstance(atracks[0], np.ndarray)) or (len(btracks) > 0 and isinstance(btracks[0], np.ndarray)):
|
||||
atlbrs = atracks
|
||||
btlbrs = btracks
|
||||
else:
|
||||
atlbrs = [track.tlwh_to_tlbr(track.pred_bbox) for track in atracks]
|
||||
btlbrs = [track.tlwh_to_tlbr(track.pred_bbox) for track in btracks]
|
||||
_ious = ious(atlbrs, btlbrs)
|
||||
cost_matrix = 1 - _ious
|
||||
|
||||
return cost_matrix
|
||||
|
||||
def embedding_distance(tracks, detections, metric='cosine'):
|
||||
"""
|
||||
:param tracks: list[STrack]
|
||||
:param detections: list[BaseTrack]
|
||||
:param metric:
|
||||
:return: cost_matrix np.ndarray
|
||||
"""
|
||||
|
||||
cost_matrix = np.zeros((len(tracks), len(detections)), dtype=np.float64)
|
||||
if cost_matrix.size == 0:
|
||||
return cost_matrix
|
||||
det_features = np.asarray([track.curr_feat for track in detections], dtype=np.float64)
|
||||
#for i, track in enumerate(tracks):
|
||||
#cost_matrix[i, :] = np.maximum(0.0, cdist(track.smooth_feat.reshape(1,-1), det_features, metric))
|
||||
track_features = np.asarray([track.smooth_feat for track in tracks], dtype=np.float64)
|
||||
cost_matrix = np.maximum(0.0, cdist(track_features, det_features, metric)) # Normalized features
|
||||
return cost_matrix
|
||||
|
||||
|
||||
def fuse_motion(kf, cost_matrix, tracks, detections, only_position=False, lambda_=0.98):
|
||||
if cost_matrix.size == 0:
|
||||
return cost_matrix
|
||||
gating_dim = 2 if only_position else 4
|
||||
gating_threshold = chi2inv95[gating_dim]
|
||||
measurements = np.asarray([det.to_xyah() for det in detections])
|
||||
for row, track in enumerate(tracks):
|
||||
gating_distance = kf.gating_distance(
|
||||
track.mean, track.covariance, measurements, only_position, metric='maha')
|
||||
cost_matrix[row, gating_distance > gating_threshold] = np.inf
|
||||
cost_matrix[row] = lambda_ * cost_matrix[row] + (1 - lambda_) * gating_distance
|
||||
return cost_matrix
|
||||
|
||||
|
||||
def fuse_iou(cost_matrix, tracks, detections):
|
||||
if cost_matrix.size == 0:
|
||||
return cost_matrix
|
||||
reid_sim = 1 - cost_matrix
|
||||
iou_dist = iou_distance(tracks, detections)
|
||||
iou_sim = 1 - iou_dist
|
||||
fuse_sim = reid_sim * (1 + iou_sim) / 2
|
||||
det_scores = np.array([det.score for det in detections])
|
||||
det_scores = np.expand_dims(det_scores, axis=0).repeat(cost_matrix.shape[0], axis=0)
|
||||
#fuse_sim = fuse_sim * (1 + det_scores) / 2
|
||||
fuse_cost = 1 - fuse_sim
|
||||
return fuse_cost
|
||||
|
||||
|
||||
def fuse_score(cost_matrix, detections):
|
||||
if cost_matrix.size == 0:
|
||||
return cost_matrix
|
||||
iou_sim = 1 - cost_matrix
|
||||
det_scores = np.array([det.score for det in detections])
|
||||
det_scores = np.expand_dims(det_scores, axis=0).repeat(cost_matrix.shape[0], axis=0)
|
||||
fuse_sim = iou_sim * det_scores
|
||||
fuse_cost = 1 - fuse_sim
|
||||
return fuse_cost
|
||||
|
||||
|
||||
def greedy_assignment_iou(dist, thresh):
|
||||
matched_indices = []
|
||||
if dist.shape[1] == 0:
|
||||
return np.array(matched_indices, np.int32).reshape(-1, 2)
|
||||
for i in range(dist.shape[0]):
|
||||
j = dist[i].argmin()
|
||||
if dist[i][j] < thresh:
|
||||
dist[:, j] = 1.
|
||||
matched_indices.append([j, i])
|
||||
return np.array(matched_indices, np.int32).reshape(-1, 2)
|
||||
|
||||
def greedy_assignment(dists, threshs):
|
||||
matches = greedy_assignment_iou(dists.T, threshs)
|
||||
u_det = [d for d in range(dists.shape[1]) if not (d in matches[:, 1])]
|
||||
u_track = [d for d in range(dists.shape[0]) if not (d in matches[:, 0])]
|
||||
return matches, u_track, u_det
|
||||
|
||||
def fuse_score_matrix(cost_matrix, detections, tracks):
|
||||
if cost_matrix.size == 0:
|
||||
return cost_matrix
|
||||
iou_sim = 1 - cost_matrix
|
||||
|
||||
det_scores = np.array([det.score for det in detections])
|
||||
det_scores = np.expand_dims(det_scores, axis=0).repeat(cost_matrix.shape[0], axis=0)
|
||||
trk_scores = np.array([trk.score for trk in tracks])
|
||||
trk_scores = np.expand_dims(trk_scores, axis=1).repeat(cost_matrix.shape[1], axis=1)
|
||||
mid_scores = (det_scores + trk_scores) / 2
|
||||
fuse_sim = iou_sim * mid_scores
|
||||
fuse_cost = 1 - fuse_sim
|
||||
|
||||
return fuse_cost
|
||||
|
||||
"""
|
||||
calculate buffered IoU, used in C_BIoU_Tracker
|
||||
"""
|
||||
def buffered_iou_distance(atracks, btracks, level=1):
|
||||
"""
|
||||
atracks: list[C_BIoUSTrack], tracks
|
||||
btracks: list[C_BIoUSTrack], detections
|
||||
level: cascade level, 1 or 2
|
||||
"""
|
||||
assert level in [1, 2], 'level must be 1 or 2'
|
||||
if level == 1: # use motion_state1(tracks) and buffer_bbox1(detections) to calculate
|
||||
atlbrs = [track.tlwh_to_tlbr(track.motion_state1) for track in atracks]
|
||||
btlbrs = [det.tlwh_to_tlbr(det.buffer_bbox1) for det in btracks]
|
||||
else:
|
||||
atlbrs = [track.tlwh_to_tlbr(track.motion_state2) for track in atracks]
|
||||
btlbrs = [det.tlwh_to_tlbr(det.buffer_bbox2) for det in btracks]
|
||||
_ious = ious(atlbrs, btlbrs)
|
||||
|
||||
cost_matrix = 1 - _ious
|
||||
return cost_matrix
|
||||
|
||||
"""
|
||||
observation-centric association, with velocity, for OC-SORT
|
||||
"""
|
||||
def observation_centric_association(tracklets, detections, iou_threshold, velocities, previous_obs, vdc_weight):
|
||||
|
||||
if len(tracklets) == 0:
|
||||
return np.empty((0, 2), dtype=int), tuple(range(len(tracklets))), tuple(range(len(detections)))
|
||||
|
||||
# get numpy format bboxes
|
||||
trk_tlbrs = np.array([track.tlbr for track in tracklets])
|
||||
det_tlbrs = np.array([det.tlbr for det in detections])
|
||||
det_scores = np.array([det.score for det in detections])
|
||||
|
||||
iou_matrix = bbox_ious(trk_tlbrs, det_tlbrs)
|
||||
|
||||
Y, X = speed_direction_batch(det_tlbrs, previous_obs)
|
||||
inertia_Y, inertia_X = velocities[:,0], velocities[:,1]
|
||||
inertia_Y = np.repeat(inertia_Y[:, np.newaxis], Y.shape[1], axis=1)
|
||||
inertia_X = np.repeat(inertia_X[:, np.newaxis], X.shape[1], axis=1)
|
||||
diff_angle_cos = inertia_X * X + inertia_Y * Y
|
||||
diff_angle_cos = np.clip(diff_angle_cos, a_min=-1, a_max=1)
|
||||
diff_angle = np.arccos(diff_angle_cos)
|
||||
diff_angle = (np.pi / 2.0 - np.abs(diff_angle)) / np.pi
|
||||
|
||||
valid_mask = np.ones(previous_obs.shape[0])
|
||||
valid_mask[np.where(previous_obs[:, 4] < 0)] = 0
|
||||
|
||||
scores = np.repeat(det_scores[:, np.newaxis], trk_tlbrs.shape[0], axis=1)
|
||||
valid_mask = np.repeat(valid_mask[:, np.newaxis], X.shape[1], axis=1)
|
||||
|
||||
angle_diff_cost = (valid_mask * diff_angle) * vdc_weight
|
||||
angle_diff_cost = angle_diff_cost * scores.T
|
||||
|
||||
matches, unmatched_a, unmatched_b = linear_assignment(- (iou_matrix + angle_diff_cost), thresh=0.9)
|
||||
|
||||
|
||||
return matches, unmatched_a, unmatched_b
|
||||
|
||||
"""
|
||||
helper function for observation_centric_association
|
||||
"""
|
||||
def speed_direction_batch(dets, tracks):
|
||||
tracks = tracks[..., np.newaxis]
|
||||
CX1, CY1 = (dets[:, 0] + dets[:, 2]) / 2.0, (dets[:,1] + dets[:,3]) / 2.0
|
||||
CX2, CY2 = (tracks[:, 0] + tracks[:, 2]) / 2.0, (tracks[:, 1] + tracks[:, 3]) / 2.0
|
||||
dx = CX2 - CX1
|
||||
dy = CY2 - CY1
|
||||
norm = np.sqrt(dx**2 + dy**2) + 1e-6
|
||||
dx = dx / norm
|
||||
dy = dy / norm
|
||||
return dy, dx # size: num_track x num_det
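# One-pair check of the direction convention above: the unit vector points
# from the detection centre to the previous track centre.
def _speed_direction_demo():
    dets = np.array([[10., 10., 20., 20.]])       # centre (15, 15)
    prev = np.array([[0., 0., 10., 10., 0.9]])    # centre (5, 5)
    dy, dx = speed_direction_batch(dets, prev)
    print(dy, dx)                                 # both ~[[-0.7071]]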
|
||||
|
||||
|
||||
def matching_cascade(
|
||||
distance_metric, matching_thresh, cascade_depth, tracks, detections,
|
||||
track_indices=None, detection_indices=None):
|
||||
"""
|
||||
Run matching cascade in DeepSORT
|
||||
|
||||
distance_metric: function that calculates the cost matrix
|
||||
matching_thresh: float, Associations with cost larger than this value are disregarded.
|
||||
cascade_depth: int, equal to max_age of a tracklet
|
||||
tracks: List[STrack], current tracks
|
||||
detections: List[STrack], current detections
|
||||
track_indices: List[int], tracks that will be calculated, Default None
|
||||
detection_indices: List[int], detections that will be calculated, Default None
|
||||
|
||||
return:
|
||||
matched pairs, unmatched tracks, unmatched detections: List[int], List[int], List[int]
|
||||
"""
|
||||
if track_indices is None:
|
||||
track_indices = list(range(len(tracks)))
|
||||
if detection_indices is None:
|
||||
detection_indices = list(range(len(detections)))
|
||||
|
||||
detections_to_match = detection_indices
|
||||
matches = []
|
||||
|
||||
for level in range(cascade_depth):
|
||||
"""
|
||||
match the most recently updated tracks with detections first
|
||||
"""
|
||||
if not len(detections_to_match): # No detections left
|
||||
break
|
||||
|
||||
track_indices_l = [
|
||||
k for k in track_indices
|
||||
if tracks[k].time_since_update == 1 + level
|
||||
] # keep tracks whose age equals level + 1 (the newest track has age 1)
|
||||
|
||||
if not len(track_indices_l): # Nothing to match at this level
|
||||
continue
|
||||
|
||||
# tracks and detections to be matched at the current level
|
||||
track_l = [tracks[idx] for idx in track_indices_l] # List[STrack]
|
||||
det_l = [detections[idx] for idx in detections_to_match] # List[STrack]
|
||||
|
||||
# calculate the cost matrix
|
||||
cost_matrix = distance_metric(track_l, det_l)
|
||||
|
||||
# solve the linear assignment problem
|
||||
matched_row_col, unmatched_row, unmatched_col = \
|
||||
linear_assignment(cost_matrix, matching_thresh)
|
||||
|
||||
for row, col in matched_row_col: # for those who matched
|
||||
matches.append((track_indices_l[row], detections_to_match[col]))
|
||||
|
||||
unmatched_detection_l = [] # current detections not matched
|
||||
for col in unmatched_col: # for detections not matched
|
||||
unmatched_detection_l.append(detections_to_match[col])
|
||||
|
||||
detections_to_match = unmatched_detection_l # update detections to match for the next level
|
||||
unmatched_tracks = list(set(track_indices) - set(k for k, _ in matches))
|
||||
|
||||
return matches, unmatched_tracks, detections_to_match
|
||||
|
||||
def nearest_embedding_distance(tracks, detections, metric='cosine'):
|
||||
"""
|
||||
unlike embedding_distance, this function calculates the
|
||||
nearest distance among all track history features and detections
|
||||
|
||||
tracks: list[STrack]
|
||||
detections: list[STrack]
|
||||
metric: str, cosine or euclidean
|
||||
TODO: support euclidean distance
|
||||
|
||||
return:
|
||||
cost_matrix, np.ndarray, shape(len(tracks), len(detections))
|
||||
"""
|
||||
cost_matrix = np.zeros((len(tracks), len(detections)))
|
||||
det_features = np.asarray([det.features[-1] for det in detections])
|
||||
|
||||
for row, track in enumerate(tracks):
|
||||
track_history_features = np.asarray(track.features)
|
||||
dist = 1. - cal_cosine_distance(track_history_features, det_features)
|
||||
dist = dist.min(axis=0)
|
||||
cost_matrix[row, :] = dist
|
||||
|
||||
return cost_matrix
|
||||
|
||||
def cal_cosine_distance(mat1, mat2):
|
||||
"""
|
||||
simple function to calculate the cosine distance between two matrices
|
||||
|
||||
:param mat1: np.ndarray, shape(M, dim)
|
||||
:param mat2: np.ndarray, shape(N, dim)
|
||||
:return: np.ndarray, shape(M, N)
|
||||
"""
|
||||
# result = mat1·mat2^T / |mat1|·|mat2|
|
||||
# norm mat1 and mat2
|
||||
mat1 = mat1 / np.linalg.norm(mat1, axis=1, keepdims=True)
|
||||
mat2 = mat2 / np.linalg.norm(mat2, axis=1, keepdims=True)
|
||||
|
||||
return np.dot(mat1, mat2.T)
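A small usage sketch for cal_cosine_distance with random features, just to show the shapes and the distance convention used in nearest_embedding_distance:

def _cosine_demo():
    rng = np.random.default_rng(0)
    feats_a = rng.random((3, 128))              # 3 track features
    feats_b = rng.random((5, 128))              # 5 detection features
    dist = 1. - cal_cosine_distance(feats_a, feats_b)
    print(dist.shape)                           # (3, 5)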
|
237
yolov7-tracker-example/tracker/trackers/ocsort_tracker.py
Normal file
@ -0,0 +1,237 @@
|
||||
"""
|
||||
OC Sort
|
||||
"""
|
||||
|
||||
import numpy as np
|
||||
from collections import deque
|
||||
from .basetrack import BaseTrack, TrackState
|
||||
from .tracklet import Tracklet, Tracklet_w_velocity
|
||||
from .matching import *
|
||||
|
||||
from cython_bbox import bbox_overlaps as bbox_ious
|
||||
|
||||
class OCSortTracker(object):
|
||||
def __init__(self, args, frame_rate=30):
|
||||
self.tracked_tracklets = [] # type: list[Tracklet]
|
||||
self.lost_tracklets = [] # type: list[Tracklet]
|
||||
self.removed_tracklets = [] # type: list[Tracklet]
|
||||
|
||||
self.frame_id = 0
|
||||
self.args = args
|
||||
|
||||
self.det_thresh = args.conf_thresh + 0.1
|
||||
self.buffer_size = int(frame_rate / 30.0 * args.track_buffer)
|
||||
self.max_time_lost = self.buffer_size
|
||||
|
||||
self.motion = args.kalman_format
|
||||
|
||||
self.delta_t = 3
|
||||
|
||||
@staticmethod
|
||||
def k_previous_obs(observations, cur_age, k):
|
||||
if len(observations) == 0:
|
||||
return [-1, -1, -1, -1, -1]
|
||||
for i in range(k):
|
||||
dt = k - i
|
||||
if cur_age - dt in observations:
|
||||
return observations[cur_age-dt]
|
||||
max_age = max(observations.keys())
|
||||
return observations[max_age]
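# Example: with delta_t = 3 and observations = {5: box_a, 8: box_b},
# k_previous_obs(observations, cur_age=10, k=3) scans ages 7, 8, 9 and
# returns box_b (age 8); with an empty history it returns [-1]*5.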
|
||||
|
||||
def update(self, output_results, img, ori_img):
|
||||
"""
|
||||
output_results: processed detections (scaled to the original size), tlbr format
|
||||
"""
|
||||
|
||||
self.frame_id += 1
|
||||
activated_tracklets = []
|
||||
refind_tracklets = []
|
||||
lost_tracklets = []
|
||||
removed_tracklets = []
|
||||
|
||||
scores = output_results[:, 4]
|
||||
bboxes = output_results[:, :4]
|
||||
categories = output_results[:, -1]
|
||||
|
||||
remain_inds = scores > self.args.conf_thresh
|
||||
inds_low = scores > 0.1
|
||||
inds_high = scores < self.args.conf_thresh
|
||||
|
||||
inds_second = np.logical_and(inds_low, inds_high)
|
||||
dets_second = bboxes[inds_second]
|
||||
dets = bboxes[remain_inds]
|
||||
|
||||
cates = categories[remain_inds]
|
||||
cates_second = categories[inds_second]
|
||||
|
||||
scores_keep = scores[remain_inds]
|
||||
scores_second = scores[inds_second]
|
||||
|
||||
if len(dets) > 0:
|
||||
'''Detections'''
|
||||
detections = [Tracklet_w_velocity(tlwh, s, cate, motion=self.motion) for
|
||||
(tlwh, s, cate) in zip(dets, scores_keep, cates)]
|
||||
else:
|
||||
detections = []
|
||||
|
||||
''' Add newly detected tracklets to tracked_tracklets'''
|
||||
unconfirmed = []
|
||||
tracked_tracklets = [] # type: list[Tracklet]
|
||||
for track in self.tracked_tracklets:
|
||||
if not track.is_activated:
|
||||
unconfirmed.append(track)
|
||||
else:
|
||||
tracked_tracklets.append(track)
|
||||
|
||||
''' Step 2: First association, Observation Centric Momentum'''
|
||||
tracklet_pool = joint_tracklets(tracked_tracklets, self.lost_tracklets)
|
||||
|
||||
velocities = np.array(
|
||||
[trk.velocity if trk.velocity is not None else np.array((0, 0)) for trk in tracklet_pool])
|
||||
|
||||
# last observation, observation-centric
|
||||
# last_boxes = np.array([trk.last_observation for trk in tracklet_pool])
|
||||
|
||||
# historical observations
|
||||
k_observations = np.array(
|
||||
[self.k_previous_obs(trk.observations, trk.age, self.delta_t) for trk in tracklet_pool])
|
||||
|
||||
|
||||
# Predict the current location with Kalman
|
||||
for tracklet in tracklet_pool:
|
||||
tracklet.predict()
|
||||
|
||||
# Observation centric cost matrix and assignment
|
||||
matches, u_track, u_detection = observation_centric_association(
|
||||
tracklets=tracklet_pool, detections=detections, iou_threshold=0.3,
|
||||
velocities=velocities, previous_obs=k_observations, vdc_weight=0.2
|
||||
)
|
||||
|
||||
for itracked, idet in matches:
|
||||
track = tracklet_pool[itracked]
|
||||
det = detections[idet]
|
||||
if track.state == TrackState.Tracked:
|
||||
track.update(detections[idet], self.frame_id)
|
||||
activated_tracklets.append(track)
|
||||
else:
|
||||
track.re_activate(det, self.frame_id, new_id=False)
|
||||
refind_tracklets.append(track)
|
||||
|
||||
''' Step 3: Second association, with low score detection boxes'''
|
||||
# associate the unmatched tracks with the low-score detections
|
||||
if len(dets_second) > 0:
|
||||
'''Detections'''
|
||||
detections_second = [Tracklet_w_velocity(tlwh, s, cate, motion=self.motion) for
|
||||
(tlwh, s, cate) in zip(dets_second, scores_second, cates_second)]
|
||||
else:
|
||||
detections_second = []
|
||||
r_tracked_tracklets = [tracklet_pool[i] for i in u_track if tracklet_pool[i].state == TrackState.Tracked]
|
||||
|
||||
# for unmatched tracks in the first round, use the last observation
|
||||
r_tracked_tracklets_last_observ = [tracklet_pool[i].last_observation[:4] for i in u_track if tracklet_pool[i].state == TrackState.Tracked]
|
||||
detections_second_bbox = [det.tlbr for det in detections_second]
|
||||
|
||||
dists = 1. - ious(r_tracked_tracklets_last_observ, detections_second_bbox)
|
||||
|
||||
matches, u_track, u_detection_second = linear_assignment(dists, thresh=0.5)
|
||||
for itracked, idet in matches:
|
||||
track = r_tracked_tracklets[itracked]
|
||||
det = detections_second[idet]
|
||||
if track.state == TrackState.Tracked:
|
||||
track.update(det, self.frame_id)
|
||||
activated_tracklets.append(track)
|
||||
else:
|
||||
track.re_activate(det, self.frame_id, new_id=False)
|
||||
refind_tracklets.append(track)
|
||||
|
||||
for it in u_track:
|
||||
track = r_tracked_tracklets[it]
|
||||
if not track.state == TrackState.Lost:
|
||||
track.mark_lost()
|
||||
lost_tracklets.append(track)
|
||||
|
||||
'''Deal with unconfirmed tracks, usually tracks with only one beginning frame'''
|
||||
detections = [detections[i] for i in u_detection]
|
||||
dists = iou_distance(unconfirmed, detections)
|
||||
|
||||
matches, u_unconfirmed, u_detection = linear_assignment(dists, thresh=0.7)
|
||||
|
||||
for itracked, idet in matches:
|
||||
unconfirmed[itracked].update(detections[idet], self.frame_id)
|
||||
activated_tracklets.append(unconfirmed[itracked])
|
||||
for it in u_unconfirmed:
|
||||
track = unconfirmed[it]
|
||||
track.mark_removed()
|
||||
removed_tracklets.append(track)
|
||||
|
||||
""" Step 4: Init new tracklets"""
|
||||
for inew in u_detection:
|
||||
track = detections[inew]
|
||||
if track.score < self.det_thresh:
|
||||
continue
|
||||
track.activate(self.frame_id)
|
||||
activated_tracklets.append(track)
|
||||
|
||||
""" Step 5: Update state"""
|
||||
for track in self.lost_tracklets:
|
||||
if self.frame_id - track.end_frame > self.max_time_lost:
|
||||
track.mark_removed()
|
||||
removed_tracklets.append(track)
|
||||
|
||||
# print('Ramained match {} s'.format(t4-t3))
|
||||
|
||||
self.tracked_tracklets = [t for t in self.tracked_tracklets if t.state == TrackState.Tracked]
|
||||
self.tracked_tracklets = joint_tracklets(self.tracked_tracklets, activated_tracklets)
|
||||
self.tracked_tracklets = joint_tracklets(self.tracked_tracklets, refind_tracklets)
|
||||
self.lost_tracklets = sub_tracklets(self.lost_tracklets, self.tracked_tracklets)
|
||||
self.lost_tracklets.extend(lost_tracklets)
|
||||
self.lost_tracklets = sub_tracklets(self.lost_tracklets, self.removed_tracklets)
|
||||
self.removed_tracklets.extend(removed_tracklets)
|
||||
self.tracked_tracklets, self.lost_tracklets = remove_duplicate_tracklets(self.tracked_tracklets, self.lost_tracklets)
|
||||
# output only the activated tracks
|
||||
output_tracklets = [track for track in self.tracked_tracklets if track.is_activated]
|
||||
|
||||
return output_tracklets
|
||||
|
||||
|
||||
|
||||
|
||||
def joint_tracklets(tlista, tlistb):
|
||||
exists = {}
|
||||
res = []
|
||||
for t in tlista:
|
||||
exists[t.track_id] = 1
|
||||
res.append(t)
|
||||
for t in tlistb:
|
||||
tid = t.track_id
|
||||
if not exists.get(tid, 0):
|
||||
exists[tid] = 1
|
||||
res.append(t)
|
||||
return res
|
||||
|
||||
|
||||
def sub_tracklets(tlista, tlistb):
|
||||
tracklets = {}
|
||||
for t in tlista:
|
||||
tracklets[t.track_id] = t
|
||||
for t in tlistb:
|
||||
tid = t.track_id
|
||||
if tracklets.get(tid, 0):
|
||||
del tracklets[tid]
|
||||
return list(tracklets.values())
|
||||
|
||||
|
||||
def remove_duplicate_tracklets(trackletsa, trackletsb):
|
||||
pdist = iou_distance(trackletsa, trackletsb)
|
||||
pairs = np.where(pdist < 0.15)
|
||||
dupa, dupb = list(), list()
|
||||
for p, q in zip(*pairs):
|
||||
timep = trackletsa[p].frame_id - trackletsa[p].start_frame
|
||||
timeq = trackletsb[q].frame_id - trackletsb[q].start_frame
|
||||
if timep > timeq:
|
||||
dupb.append(q)
|
||||
else:
|
||||
dupa.append(p)
|
||||
resa = [t for i, t in enumerate(trackletsa) if not i in dupa]
|
||||
resb = [t for i, t in enumerate(trackletsb) if not i in dupb]
|
||||
return resa, resb
|
@ -0,0 +1,98 @@
|
||||
"""
|
||||
AFLink code from StrongSORT (StrongSORT: Make DeepSORT Great Again, arXiv)
|
||||
|
||||
copied from the original repo
|
||||
"""
|
||||
import torch
|
||||
import torch.nn as nn
|
||||
import torch.nn.functional as F
|
||||
import numpy as np
|
||||
import cv2
|
||||
import logging
|
||||
import torchvision.transforms as transforms
|
||||
|
||||
|
||||
class TemporalBlock(nn.Module):
|
||||
def __init__(self, cin, cout):
|
||||
super(TemporalBlock, self).__init__()
|
||||
self.conv = nn.Conv2d(cin, cout, (7, 1), bias=False)
|
||||
self.relu = nn.ReLU(inplace=True)
|
||||
self.bnf = nn.BatchNorm1d(cout)
|
||||
self.bnx = nn.BatchNorm1d(cout)
|
||||
self.bny = nn.BatchNorm1d(cout)
|
||||
|
||||
def bn(self, x):
|
||||
x[:, :, :, 0] = self.bnf(x[:, :, :, 0])
|
||||
x[:, :, :, 1] = self.bnx(x[:, :, :, 1])
|
||||
x[:, :, :, 2] = self.bny(x[:, :, :, 2])
|
||||
return x
|
||||
|
||||
def forward(self, x):
|
||||
x = self.conv(x)
|
||||
x = self.bn(x)
|
||||
x = self.relu(x)
|
||||
return x
|
||||
|
||||
|
||||
class FusionBlock(nn.Module):
|
||||
def __init__(self, cin, cout):
|
||||
super(FusionBlock, self).__init__()
|
||||
self.conv = nn.Conv2d(cin, cout, (1, 3), bias=False)
|
||||
self.bn = nn.BatchNorm2d(cout)
|
||||
self.relu = nn.ReLU(inplace=True)
|
||||
|
||||
def forward(self, x):
|
||||
x = self.conv(x)
|
||||
x = self.bn(x)
|
||||
x = self.relu(x)
|
||||
return x
|
||||
|
||||
|
||||
class Classifier(nn.Module):
|
||||
def __init__(self, cin):
|
||||
super(Classifier, self).__init__()
|
||||
self.fc1 = nn.Linear(cin*2, cin//2)
|
||||
self.relu = nn.ReLU(inplace=True)
|
||||
self.fc2 = nn.Linear(cin//2, 2)
|
||||
|
||||
def forward(self, x1, x2):
|
||||
x = torch.cat((x1, x2), dim=1)
|
||||
x = self.fc1(x)
|
||||
x = self.relu(x)
|
||||
x = self.fc2(x)
|
||||
return x
|
||||
|
||||
|
||||
class PostLinker(nn.Module):
|
||||
def __init__(self):
|
||||
super(PostLinker, self).__init__()
|
||||
self.TemporalModule_1 = nn.Sequential(
|
||||
TemporalBlock(1, 32),
|
||||
TemporalBlock(32, 64),
|
||||
TemporalBlock(64, 128),
|
||||
TemporalBlock(128, 256)
|
||||
)
|
||||
self.TemporalModule_2 = nn.Sequential(
|
||||
TemporalBlock(1, 32),
|
||||
TemporalBlock(32, 64),
|
||||
TemporalBlock(64, 128),
|
||||
TemporalBlock(128, 256)
|
||||
)
|
||||
self.FusionBlock_1 = FusionBlock(256, 256)
|
||||
self.FusionBlock_2 = FusionBlock(256, 256)
|
||||
self.pooling = nn.AdaptiveAvgPool2d((1, 1))
|
||||
self.classifier = Classifier(256)
|
||||
|
||||
def forward(self, x1, x2):
|
||||
x1 = x1[:, :, :, :3]
|
||||
x2 = x2[:, :, :, :3]
|
||||
x1 = self.TemporalModule_1(x1) # [B,1,30,3] -> [B,256,6,3]
|
||||
x2 = self.TemporalModule_2(x2)
|
||||
x1 = self.FusionBlock_1(x1)
|
||||
x2 = self.FusionBlock_2(x2)
|
||||
x1 = self.pooling(x1).squeeze(-1).squeeze(-1)
|
||||
x2 = self.pooling(x2).squeeze(-1).squeeze(-1)
|
||||
y = self.classifier(x1, x2)
|
||||
if not self.training:
|
||||
y = torch.softmax(y, dim=1)
|
||||
return y
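A quick shape check for PostLinker with dummy [B, 1, 30, 3] trajectory tensors, i.e. 30 frames of (frame, x, y) per track as the comment in forward suggests:

import torch

model = PostLinker()
model.eval()
with torch.no_grad():
    t1 = torch.rand(2, 1, 30, 3)
    t2 = torch.rand(2, 1, 30, 3)
    probs = model(t1, t2)
print(probs.shape)  # torch.Size([2, 2]): softmax link / no-link scores per pair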
|
598
yolov7-tracker-example/tracker/trackers/reid_models/OSNet.py
Normal file
@ -0,0 +1,598 @@
|
||||
from __future__ import division, absolute_import
|
||||
import warnings
|
||||
import torch
|
||||
from torch import nn
|
||||
from torch.nn import functional as F
|
||||
|
||||
__all__ = [
|
||||
'osnet_x1_0', 'osnet_x0_75', 'osnet_x0_5', 'osnet_x0_25', 'osnet_ibn_x1_0'
|
||||
]
|
||||
|
||||
pretrained_urls = {
|
||||
'osnet_x1_0':
|
||||
'https://drive.google.com/uc?id=1LaG1EJpHrxdAxKnSCJ_i0u-nbxSAeiFY',
|
||||
'osnet_x0_75':
|
||||
'https://drive.google.com/uc?id=1uwA9fElHOk3ZogwbeY5GkLI6QPTX70Hq',
|
||||
'osnet_x0_5':
|
||||
'https://drive.google.com/uc?id=16DGLbZukvVYgINws8u8deSaOqjybZ83i',
|
||||
'osnet_x0_25':
|
||||
'https://drive.google.com/uc?id=1rb8UN5ZzPKRc_xvtHlyDh-cSz88YX9hs',
|
||||
'osnet_ibn_x1_0':
|
||||
'https://drive.google.com/uc?id=1sr90V6irlYYDd4_4ISU2iruoRG8J__6l'
|
||||
}
|
||||
|
||||
|
||||
##########
|
||||
# Basic layers
|
||||
##########
|
||||
class ConvLayer(nn.Module):
|
||||
"""Convolution layer (conv + bn + relu)."""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
in_channels,
|
||||
out_channels,
|
||||
kernel_size,
|
||||
stride=1,
|
||||
padding=0,
|
||||
groups=1,
|
||||
IN=False
|
||||
):
|
||||
super(ConvLayer, self).__init__()
|
||||
self.conv = nn.Conv2d(
|
||||
in_channels,
|
||||
out_channels,
|
||||
kernel_size,
|
||||
stride=stride,
|
||||
padding=padding,
|
||||
bias=False,
|
||||
groups=groups
|
||||
)
|
||||
if IN:
|
||||
self.bn = nn.InstanceNorm2d(out_channels, affine=True)
|
||||
else:
|
||||
self.bn = nn.BatchNorm2d(out_channels)
|
||||
self.relu = nn.ReLU(inplace=True)
|
||||
|
||||
def forward(self, x):
|
||||
x = self.conv(x)
|
||||
x = self.bn(x)
|
||||
x = self.relu(x)
|
||||
return x
|
||||
|
||||
|
||||
class Conv1x1(nn.Module):
|
||||
"""1x1 convolution + bn + relu."""
|
||||
|
||||
def __init__(self, in_channels, out_channels, stride=1, groups=1):
|
||||
super(Conv1x1, self).__init__()
|
||||
self.conv = nn.Conv2d(
|
||||
in_channels,
|
||||
out_channels,
|
||||
1,
|
||||
stride=stride,
|
||||
padding=0,
|
||||
bias=False,
|
||||
groups=groups
|
||||
)
|
||||
self.bn = nn.BatchNorm2d(out_channels)
|
||||
self.relu = nn.ReLU(inplace=True)
|
||||
|
||||
def forward(self, x):
|
||||
x = self.conv(x)
|
||||
x = self.bn(x)
|
||||
x = self.relu(x)
|
||||
return x
|
||||
|
||||
|
||||
class Conv1x1Linear(nn.Module):
|
||||
"""1x1 convolution + bn (w/o non-linearity)."""
|
||||
|
||||
def __init__(self, in_channels, out_channels, stride=1):
|
||||
super(Conv1x1Linear, self).__init__()
|
||||
self.conv = nn.Conv2d(
|
||||
in_channels, out_channels, 1, stride=stride, padding=0, bias=False
|
||||
)
|
||||
self.bn = nn.BatchNorm2d(out_channels)
|
||||
|
||||
def forward(self, x):
|
||||
x = self.conv(x)
|
||||
x = self.bn(x)
|
||||
return x
|
||||
|
||||
|
||||
class Conv3x3(nn.Module):
|
||||
"""3x3 convolution + bn + relu."""
|
||||
|
||||
def __init__(self, in_channels, out_channels, stride=1, groups=1):
|
||||
super(Conv3x3, self).__init__()
|
||||
self.conv = nn.Conv2d(
|
||||
in_channels,
|
||||
out_channels,
|
||||
3,
|
||||
stride=stride,
|
||||
padding=1,
|
||||
bias=False,
|
||||
groups=groups
|
||||
)
|
||||
self.bn = nn.BatchNorm2d(out_channels)
|
||||
self.relu = nn.ReLU(inplace=True)
|
||||
|
||||
def forward(self, x):
|
||||
x = self.conv(x)
|
||||
x = self.bn(x)
|
||||
x = self.relu(x)
|
||||
return x
|
||||
|
||||
|
||||
class LightConv3x3(nn.Module):
|
||||
"""Lightweight 3x3 convolution.
|
||||
|
||||
1x1 (linear) + dw 3x3 (nonlinear).
|
||||
"""
|
||||
|
||||
def __init__(self, in_channels, out_channels):
|
||||
super(LightConv3x3, self).__init__()
|
||||
self.conv1 = nn.Conv2d(
|
||||
in_channels, out_channels, 1, stride=1, padding=0, bias=False
|
||||
)
|
||||
self.conv2 = nn.Conv2d(
|
||||
out_channels,
|
||||
out_channels,
|
||||
3,
|
||||
stride=1,
|
||||
padding=1,
|
||||
bias=False,
|
||||
groups=out_channels
|
||||
)
|
||||
self.bn = nn.BatchNorm2d(out_channels)
|
||||
self.relu = nn.ReLU(inplace=True)
|
||||
|
||||
def forward(self, x):
|
||||
x = self.conv1(x)
|
||||
x = self.conv2(x)
|
||||
x = self.bn(x)
|
||||
x = self.relu(x)
|
||||
return x
|
||||
|
||||
|
||||
##########
|
||||
# Building blocks for omni-scale feature learning
|
||||
##########
|
||||
class ChannelGate(nn.Module):
|
||||
"""A mini-network that generates channel-wise gates conditioned on input tensor."""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
in_channels,
|
||||
num_gates=None,
|
||||
return_gates=False,
|
||||
gate_activation='sigmoid',
|
||||
reduction=16,
|
||||
layer_norm=False
|
||||
):
|
||||
super(ChannelGate, self).__init__()
|
||||
if num_gates is None:
|
||||
num_gates = in_channels
|
||||
self.return_gates = return_gates
|
||||
self.global_avgpool = nn.AdaptiveAvgPool2d(1)
|
||||
self.fc1 = nn.Conv2d(
|
||||
in_channels,
|
||||
in_channels // reduction,
|
||||
kernel_size=1,
|
||||
bias=True,
|
||||
padding=0
|
||||
)
|
||||
self.norm1 = None
|
||||
if layer_norm:
|
||||
self.norm1 = nn.LayerNorm((in_channels // reduction, 1, 1))
|
||||
self.relu = nn.ReLU(inplace=True)
|
||||
self.fc2 = nn.Conv2d(
|
||||
in_channels // reduction,
|
||||
num_gates,
|
||||
kernel_size=1,
|
||||
bias=True,
|
||||
padding=0
|
||||
)
|
||||
if gate_activation == 'sigmoid':
|
||||
self.gate_activation = nn.Sigmoid()
|
||||
elif gate_activation == 'relu':
|
||||
self.gate_activation = nn.ReLU(inplace=True)
|
||||
elif gate_activation == 'linear':
|
||||
self.gate_activation = None
|
||||
else:
|
||||
raise RuntimeError(
|
||||
"Unknown gate activation: {}".format(gate_activation)
|
||||
)
|
||||
|
||||
def forward(self, x):
|
||||
input = x
|
||||
x = self.global_avgpool(x)
|
||||
x = self.fc1(x)
|
||||
if self.norm1 is not None:
|
||||
x = self.norm1(x)
|
||||
x = self.relu(x)
|
||||
x = self.fc2(x)
|
||||
if self.gate_activation is not None:
|
||||
x = self.gate_activation(x)
|
||||
if self.return_gates:
|
||||
return x
|
||||
return input * x
|
||||
|
||||
|
||||
class OSBlock(nn.Module):
|
||||
"""Omni-scale feature learning block."""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
in_channels,
|
||||
out_channels,
|
||||
IN=False,
|
||||
bottleneck_reduction=4,
|
||||
**kwargs
|
||||
):
|
||||
super(OSBlock, self).__init__()
|
||||
mid_channels = out_channels // bottleneck_reduction
|
||||
self.conv1 = Conv1x1(in_channels, mid_channels)
|
||||
self.conv2a = LightConv3x3(mid_channels, mid_channels)
|
||||
self.conv2b = nn.Sequential(
|
||||
LightConv3x3(mid_channels, mid_channels),
|
||||
LightConv3x3(mid_channels, mid_channels),
|
||||
)
|
||||
self.conv2c = nn.Sequential(
|
||||
LightConv3x3(mid_channels, mid_channels),
|
||||
LightConv3x3(mid_channels, mid_channels),
|
||||
LightConv3x3(mid_channels, mid_channels),
|
||||
)
|
||||
self.conv2d = nn.Sequential(
|
||||
LightConv3x3(mid_channels, mid_channels),
|
||||
LightConv3x3(mid_channels, mid_channels),
|
||||
LightConv3x3(mid_channels, mid_channels),
|
||||
LightConv3x3(mid_channels, mid_channels),
|
||||
)
|
||||
self.gate = ChannelGate(mid_channels)
|
||||
self.conv3 = Conv1x1Linear(mid_channels, out_channels)
|
||||
self.downsample = None
|
||||
if in_channels != out_channels:
|
||||
self.downsample = Conv1x1Linear(in_channels, out_channels)
|
||||
self.IN = None
|
||||
if IN:
|
||||
self.IN = nn.InstanceNorm2d(out_channels, affine=True)
|
||||
|
||||
def forward(self, x):
|
||||
identity = x
|
||||
x1 = self.conv1(x)
|
||||
x2a = self.conv2a(x1)
|
||||
x2b = self.conv2b(x1)
|
||||
x2c = self.conv2c(x1)
|
||||
x2d = self.conv2d(x1)
|
||||
x2 = self.gate(x2a) + self.gate(x2b) + self.gate(x2c) + self.gate(x2d)
|
||||
x3 = self.conv3(x2)
|
||||
if self.downsample is not None:
|
||||
identity = self.downsample(identity)
|
||||
out = x3 + identity
|
||||
if self.IN is not None:
|
||||
out = self.IN(out)
|
||||
return F.relu(out)
|
||||
|
||||
|
||||
##########
|
||||
# Network architecture
|
||||
##########
|
||||
class OSNet(nn.Module):
|
||||
"""Omni-Scale Network.
|
||||
|
||||
Reference:
|
||||
- Zhou et al. Omni-Scale Feature Learning for Person Re-Identification. ICCV, 2019.
|
||||
- Zhou et al. Learning Generalisable Omni-Scale Representations
|
||||
for Person Re-Identification. TPAMI, 2021.
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
num_classes,
|
||||
blocks,
|
||||
layers,
|
||||
channels,
|
||||
feature_dim=512,
|
||||
loss='softmax',
|
||||
IN=False,
|
||||
**kwargs
|
||||
):
|
||||
super(OSNet, self).__init__()
|
||||
num_blocks = len(blocks)
|
||||
assert num_blocks == len(layers)
|
||||
assert num_blocks == len(channels) - 1
|
||||
self.loss = loss
|
||||
self.feature_dim = feature_dim
|
||||
|
||||
# convolutional backbone
|
||||
self.conv1 = ConvLayer(3, channels[0], 7, stride=2, padding=3, IN=IN)
|
||||
self.maxpool = nn.MaxPool2d(3, stride=2, padding=1)
|
||||
self.conv2 = self._make_layer(
|
||||
blocks[0],
|
||||
layers[0],
|
||||
channels[0],
|
||||
channels[1],
|
||||
reduce_spatial_size=True,
|
||||
IN=IN
|
||||
)
|
||||
self.conv3 = self._make_layer(
|
||||
blocks[1],
|
||||
layers[1],
|
||||
channels[1],
|
||||
channels[2],
|
||||
reduce_spatial_size=True
|
||||
)
|
||||
self.conv4 = self._make_layer(
|
||||
blocks[2],
|
||||
layers[2],
|
||||
channels[2],
|
||||
channels[3],
|
||||
reduce_spatial_size=False
|
||||
)
|
||||
self.conv5 = Conv1x1(channels[3], channels[3])
|
||||
self.global_avgpool = nn.AdaptiveAvgPool2d(1)
|
||||
# fully connected layer
|
||||
self.fc = self._construct_fc_layer(
|
||||
self.feature_dim, channels[3], dropout_p=None
|
||||
)
|
||||
# identity classification layer
|
||||
self.classifier = nn.Linear(self.feature_dim, num_classes)
|
||||
|
||||
self._init_params()
|
||||
|
||||
def _make_layer(
|
||||
self,
|
||||
block,
|
||||
layer,
|
||||
in_channels,
|
||||
out_channels,
|
||||
reduce_spatial_size,
|
||||
IN=False
|
||||
):
|
||||
layers = []
|
||||
|
||||
layers.append(block(in_channels, out_channels, IN=IN))
|
||||
for i in range(1, layer):
|
||||
layers.append(block(out_channels, out_channels, IN=IN))
|
||||
|
||||
if reduce_spatial_size:
|
||||
layers.append(
|
||||
nn.Sequential(
|
||||
Conv1x1(out_channels, out_channels),
|
||||
nn.AvgPool2d(2, stride=2)
|
||||
)
|
||||
)
|
||||
|
||||
return nn.Sequential(*layers)
|
||||
|
||||
def _construct_fc_layer(self, fc_dims, input_dim, dropout_p=None):
|
||||
if fc_dims is None or fc_dims < 0:
|
||||
self.feature_dim = input_dim
|
||||
return None
|
||||
|
||||
if isinstance(fc_dims, int):
|
||||
fc_dims = [fc_dims]
|
||||
|
||||
layers = []
|
||||
for dim in fc_dims:
|
||||
layers.append(nn.Linear(input_dim, dim))
|
||||
layers.append(nn.BatchNorm1d(dim))
|
||||
layers.append(nn.ReLU(inplace=True))
|
||||
if dropout_p is not None:
|
||||
layers.append(nn.Dropout(p=dropout_p))
|
||||
input_dim = dim
|
||||
|
||||
self.feature_dim = fc_dims[-1]
|
||||
|
||||
return nn.Sequential(*layers)
|
||||
|
||||
def _init_params(self):
|
||||
for m in self.modules():
|
||||
if isinstance(m, nn.Conv2d):
|
||||
nn.init.kaiming_normal_(
|
||||
m.weight, mode='fan_out', nonlinearity='relu'
|
||||
)
|
||||
if m.bias is not None:
|
||||
nn.init.constant_(m.bias, 0)
|
||||
|
||||
elif isinstance(m, nn.BatchNorm2d):
|
||||
nn.init.constant_(m.weight, 1)
|
||||
nn.init.constant_(m.bias, 0)
|
||||
|
||||
elif isinstance(m, nn.BatchNorm1d):
|
||||
nn.init.constant_(m.weight, 1)
|
||||
nn.init.constant_(m.bias, 0)
|
||||
|
||||
elif isinstance(m, nn.Linear):
|
||||
nn.init.normal_(m.weight, 0, 0.01)
|
||||
if m.bias is not None:
|
||||
nn.init.constant_(m.bias, 0)
|
||||
|
||||
def featuremaps(self, x):
|
||||
x = self.conv1(x)
|
||||
x = self.maxpool(x)
|
||||
x = self.conv2(x)
|
||||
x = self.conv3(x)
|
||||
x = self.conv4(x)
|
||||
x = self.conv5(x)
|
||||
return x
|
||||
|
||||
def forward(self, x, return_featuremaps=False):
|
||||
x = self.featuremaps(x)
|
||||
if return_featuremaps:
|
||||
return x
|
||||
v = self.global_avgpool(x)
|
||||
v = v.view(v.size(0), -1)
|
||||
if self.fc is not None:
|
||||
v = self.fc(v)
|
||||
if not self.training:
|
||||
return v
|
||||
y = self.classifier(v)
|
||||
if self.loss == 'softmax':
|
||||
return y
|
||||
elif self.loss == 'triplet':
|
||||
return y, v
|
||||
else:
|
||||
raise KeyError("Unsupported loss: {}".format(self.loss))
|
||||
|
||||
|
||||
def init_pretrained_weights(model, key=''):
|
||||
"""Initializes model with pretrained weights.
|
||||
|
||||
Layers that don't match with pretrained layers in name or size are kept unchanged.
|
||||
"""
|
||||
import os
|
||||
import errno
|
||||
import gdown
|
||||
from collections import OrderedDict
|
||||
|
||||
def _get_torch_home():
|
||||
ENV_TORCH_HOME = 'TORCH_HOME'
|
||||
ENV_XDG_CACHE_HOME = 'XDG_CACHE_HOME'
|
||||
DEFAULT_CACHE_DIR = '~/.cache'
|
||||
torch_home = os.path.expanduser(
|
||||
os.getenv(
|
||||
ENV_TORCH_HOME,
|
||||
os.path.join(
|
||||
os.getenv(ENV_XDG_CACHE_HOME, DEFAULT_CACHE_DIR), 'torch'
|
||||
)
|
||||
)
|
||||
)
|
||||
return torch_home
|
||||
|
||||
torch_home = _get_torch_home()
|
||||
model_dir = os.path.join(torch_home, 'checkpoints')
|
||||
try:
|
||||
os.makedirs(model_dir)
|
||||
except OSError as e:
|
||||
if e.errno == errno.EEXIST:
|
||||
# Directory already exists, ignore.
|
||||
pass
|
||||
else:
|
||||
# Unexpected OSError, re-raise.
|
||||
raise
|
||||
filename = key + '_imagenet.pth'
|
||||
cached_file = os.path.join(model_dir, filename)
|
||||
|
||||
if not os.path.exists(cached_file):
|
||||
gdown.download(pretrained_urls[key], cached_file, quiet=False)
|
||||
|
||||
state_dict = torch.load(cached_file)
|
||||
model_dict = model.state_dict()
|
||||
new_state_dict = OrderedDict()
|
||||
matched_layers, discarded_layers = [], []
|
||||
|
||||
for k, v in state_dict.items():
|
||||
if k.startswith('module.'):
|
||||
k = k[7:] # discard module.
|
||||
|
||||
if k in model_dict and model_dict[k].size() == v.size():
|
||||
new_state_dict[k] = v
|
||||
matched_layers.append(k)
|
||||
else:
|
||||
discarded_layers.append(k)
|
||||
|
||||
model_dict.update(new_state_dict)
|
||||
model.load_state_dict(model_dict)
|
||||
|
||||
if len(matched_layers) == 0:
|
||||
warnings.warn(
|
||||
'The pretrained weights from "{}" cannot be loaded, '
|
||||
'please check the key names manually '
|
||||
'(** ignored and continue **)'.format(cached_file)
|
||||
)
|
||||
else:
|
||||
print(
|
||||
'Successfully loaded imagenet pretrained weights from "{}"'.
|
||||
format(cached_file)
|
||||
)
|
||||
if len(discarded_layers) > 0:
|
||||
print(
|
||||
'** The following layers are discarded '
|
||||
'due to unmatched keys or layer size: {}'.
|
||||
format(discarded_layers)
|
||||
)
|
||||
|
||||
|
||||
##########
|
||||
# Instantiation
|
||||
##########
|
||||
def osnet_x1_0(num_classes=1000, pretrained=True, loss='softmax', **kwargs):
|
||||
# standard size (width x1.0)
|
||||
model = OSNet(
|
||||
num_classes,
|
||||
blocks=[OSBlock, OSBlock, OSBlock],
|
||||
layers=[2, 2, 2],
|
||||
channels=[64, 256, 384, 512],
|
||||
loss=loss,
|
||||
**kwargs
|
||||
)
|
||||
if pretrained:
|
||||
init_pretrained_weights(model, key='osnet_x1_0')
|
||||
return model
|
||||
|
||||
|
||||
def osnet_x0_75(num_classes=1000, pretrained=True, loss='softmax', **kwargs):
|
||||
# medium size (width x0.75)
|
||||
model = OSNet(
|
||||
num_classes,
|
||||
blocks=[OSBlock, OSBlock, OSBlock],
|
||||
layers=[2, 2, 2],
|
||||
channels=[48, 192, 288, 384],
|
||||
loss=loss,
|
||||
**kwargs
|
||||
)
|
||||
if pretrained:
|
||||
init_pretrained_weights(model, key='osnet_x0_75')
|
||||
return model
|
||||
|
||||
|
||||
def osnet_x0_5(num_classes=1000, pretrained=True, loss='softmax', **kwargs):
|
||||
# tiny size (width x0.5)
|
||||
model = OSNet(
|
||||
num_classes,
|
||||
blocks=[OSBlock, OSBlock, OSBlock],
|
||||
layers=[2, 2, 2],
|
||||
channels=[32, 128, 192, 256],
|
||||
loss=loss,
|
||||
**kwargs
|
||||
)
|
||||
if pretrained:
|
||||
init_pretrained_weights(model, key='osnet_x0_5')
|
||||
return model
|
||||
|
||||
|
||||
def osnet_x0_25(num_classes=1000, pretrained=True, loss='softmax', **kwargs):
|
||||
# very tiny size (width x0.25)
|
||||
model = OSNet(
|
||||
num_classes,
|
||||
blocks=[OSBlock, OSBlock, OSBlock],
|
||||
layers=[2, 2, 2],
|
||||
channels=[16, 64, 96, 128],
|
||||
loss=loss,
|
||||
**kwargs
|
||||
)
|
||||
if pretrained:
|
||||
init_pretrained_weights(model, key='osnet_x0_25')
|
||||
return model
|
||||
|
||||
|
||||
def osnet_ibn_x1_0(
|
||||
num_classes=1000, pretrained=True, loss='softmax', **kwargs
|
||||
):
|
||||
# standard size (width x1.0) + IBN layer
|
||||
# Ref: Pan et al. Two at Once: Enhancing Learning and Generalization Capacities via IBN-Net. ECCV, 2018.
|
||||
model = OSNet(
|
||||
num_classes,
|
||||
blocks=[OSBlock, OSBlock, OSBlock],
|
||||
layers=[2, 2, 2],
|
||||
channels=[64, 256, 384, 512],
|
||||
loss=loss,
|
||||
IN=True,
|
||||
**kwargs
|
||||
)
|
||||
if pretrained:
|
||||
init_pretrained_weights(model, key='osnet_ibn_x1_0')
|
||||
return model
|
@ -0,0 +1,3 @@
|
||||
"""
|
||||
file for reid_models folder
|
||||
"""
|
@ -0,0 +1,157 @@
|
||||
"""
|
||||
file for DeepSORT Re-ID model
|
||||
"""
|
||||
|
||||
import torch
|
||||
import torch.nn as nn
|
||||
import torch.nn.functional as F
|
||||
import numpy as np
|
||||
import cv2
|
||||
import logging
|
||||
import torchvision.transforms as transforms
|
||||
|
||||
|
||||
class BasicBlock(nn.Module):
|
||||
def __init__(self, c_in, c_out, is_downsample=False):
|
||||
super(BasicBlock, self).__init__()
|
||||
self.is_downsample = is_downsample
|
||||
if is_downsample:
|
||||
self.conv1 = nn.Conv2d(
|
||||
c_in, c_out, 3, stride=2, padding=1, bias=False)
|
||||
else:
|
||||
self.conv1 = nn.Conv2d(
|
||||
c_in, c_out, 3, stride=1, padding=1, bias=False)
|
||||
self.bn1 = nn.BatchNorm2d(c_out)
|
||||
self.relu = nn.ReLU(True)
|
||||
self.conv2 = nn.Conv2d(c_out, c_out, 3, stride=1,
|
||||
padding=1, bias=False)
|
||||
self.bn2 = nn.BatchNorm2d(c_out)
|
||||
if is_downsample:
|
||||
self.downsample = nn.Sequential(
|
||||
nn.Conv2d(c_in, c_out, 1, stride=2, bias=False),
|
||||
nn.BatchNorm2d(c_out)
|
||||
)
|
||||
elif c_in != c_out:
|
||||
self.downsample = nn.Sequential(
|
||||
nn.Conv2d(c_in, c_out, 1, stride=1, bias=False),
|
||||
nn.BatchNorm2d(c_out)
|
||||
)
|
||||
self.is_downsample = True
|
||||
|
||||
def forward(self, x):
|
||||
y = self.conv1(x)
|
||||
y = self.bn1(y)
|
||||
y = self.relu(y)
|
||||
y = self.conv2(y)
|
||||
y = self.bn2(y)
|
||||
if self.is_downsample:
|
||||
x = self.downsample(x)
|
||||
return F.relu(x.add(y), True)
|
||||
|
||||
|
||||
def make_layers(c_in, c_out, repeat_times, is_downsample=False):
|
||||
blocks = []
|
||||
for i in range(repeat_times):
|
||||
if i == 0:
|
||||
blocks += [BasicBlock(c_in, c_out, is_downsample=is_downsample), ]
|
||||
else:
|
||||
blocks += [BasicBlock(c_out, c_out), ]
|
||||
return nn.Sequential(*blocks)
|
||||
|
||||
|
||||
class Net(nn.Module):
|
||||
def __init__(self, num_classes=751, reid=False):
|
||||
super(Net, self).__init__()
|
||||
# 3 128 64
|
||||
self.conv = nn.Sequential(
|
||||
nn.Conv2d(3, 64, 3, stride=1, padding=1),
|
||||
nn.BatchNorm2d(64),
|
||||
nn.ReLU(inplace=True),
|
||||
# nn.Conv2d(32,32,3,stride=1,padding=1),
|
||||
# nn.BatchNorm2d(32),
|
||||
# nn.ReLU(inplace=True),
|
||||
nn.MaxPool2d(3, 2, padding=1),
|
||||
)
|
||||
# 64 64 32
|
||||
self.layer1 = make_layers(64, 64, 2, False)
|
||||
# 64 64 32
|
||||
self.layer2 = make_layers(64, 128, 2, True)
|
||||
# 128 32 16
|
||||
self.layer3 = make_layers(128, 256, 2, True)
|
||||
# 256 16 8
|
||||
self.layer4 = make_layers(256, 512, 2, True)
|
||||
# 512 8 4
|
||||
self.avgpool = nn.AvgPool2d((8, 4), 1)
|
||||
# 512 1 1
|
||||
self.reid = reid
|
||||
self.classifier = nn.Sequential(
|
||||
nn.Linear(512, 256),
|
||||
nn.BatchNorm1d(256),
|
||||
nn.ReLU(inplace=True),
|
||||
nn.Dropout(),
|
||||
nn.Linear(256, num_classes),
|
||||
)
|
||||
|
||||
def forward(self, x):
|
||||
x = self.conv(x)
|
||||
x = self.layer1(x)
|
||||
x = self.layer2(x)
|
||||
x = self.layer3(x)
|
||||
x = self.layer4(x)
|
||||
x = self.avgpool(x)
|
||||
x = x.view(x.size(0), -1)
|
||||
# B x 512
|
||||
if self.reid:
|
||||
x = x.div(x.norm(p=2, dim=1, keepdim=True))
|
||||
return x
|
||||
# classifier
|
||||
x = self.classifier(x)
|
||||
return x
|
||||
|
||||
|
||||
class Extractor(object):
    def __init__(self, model_path, use_cuda=True):
        self.net = Net(reid=True)
        self.device = "cuda" if torch.cuda.is_available() and use_cuda else "cpu"
        state_dict = torch.load(model_path, map_location=torch.device(self.device))[
            'net_dict']
        self.net.load_state_dict(state_dict)
        logger = logging.getLogger("root.tracker")
        logger.info("Loading weights from {}... Done!".format(model_path))
        self.net.to(self.device)
        self.size = (64, 128)
        self.norm = transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
        ])

    def _preprocess(self, im_crops):
        """
        1. convert to float, scaled from 0 to 1
        2. resize to (64, 128) as the Market1501 dataset did
        3. concatenate into a batch
        4. convert to torch Tensor
        5. normalize
        """
        def _resize(im, size):
            try:
                return cv2.resize(im.astype(np.float32) / 255., size)
            except cv2.error:
                print('Error: bbox with a zero-sized side, ', im.shape)
                exit(0)

        im_batch = torch.cat([self.norm(_resize(im, self.size)).unsqueeze(
            0) for im in im_crops], dim=0).float()
        return im_batch

    def __call__(self, im_crops):
        if isinstance(im_crops, list):
            im_batch = self._preprocess(im_crops)
        else:
            im_batch = im_crops

        with torch.no_grad():
            im_batch = im_batch.to(self.device)
            features = self.net(im_batch)
        return features
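For orientation, a minimal smoke test (an illustrative assumption, not part of the original file): it pushes random data through an untrained Net to show the re-id feature shape; Extractor itself additionally needs a real checkpoint path.

if __name__ == '__main__':
    net = Net(reid=True)
    net.eval()                            # BatchNorm in eval mode for a tiny batch
    dummy = torch.rand(4, 3, 128, 64)     # 4 crops, already resized/normalized
    with torch.no_grad():
        feats = net(dummy)
    print(feats.shape)                    # torch.Size([4, 512]), rows L2-normalized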
@ -0,0 +1,273 @@
"""
load checkpoint file
copied from https://github.com/mikel-brostrom/Yolov5_StrongSORT_OSNet
"""
from __future__ import division, print_function, absolute_import
import pickle
import shutil
import os.path as osp
import warnings
from functools import partial
from collections import OrderedDict
import torch
import torch.nn as nn


__all__ = [
    'save_checkpoint', 'load_checkpoint', 'resume_from_checkpoint',
    'open_all_layers', 'open_specified_layers', 'count_num_param',
    'load_pretrained_weights'
]


def load_checkpoint(fpath):
    r"""Loads checkpoint.

    ``UnicodeDecodeError`` is handled gracefully, which means
    python2-saved files can be read from python3.

    Args:
        fpath (str): path to checkpoint.

    Returns:
        dict

    Examples::
        >>> from torchreid.utils import load_checkpoint
        >>> fpath = 'log/my_model/model.pth.tar-10'
        >>> checkpoint = load_checkpoint(fpath)
    """
    if fpath is None:
        raise ValueError('File path is None')
    fpath = osp.abspath(osp.expanduser(fpath))
    if not osp.exists(fpath):
        raise FileNotFoundError('File is not found at "{}"'.format(fpath))
    map_location = None if torch.cuda.is_available() else 'cpu'
    try:
        checkpoint = torch.load(fpath, map_location=map_location)
    except UnicodeDecodeError:
        pickle.load = partial(pickle.load, encoding="latin1")
        pickle.Unpickler = partial(pickle.Unpickler, encoding="latin1")
        checkpoint = torch.load(
            fpath, pickle_module=pickle, map_location=map_location
        )
    except Exception:
        print('Unable to load checkpoint from "{}"'.format(fpath))
        raise
    return checkpoint

def resume_from_checkpoint(fpath, model, optimizer=None, scheduler=None):
    r"""Resumes training from a checkpoint.

    This will load (1) model weights and (2) the ``state_dict``
    of the optimizer if ``optimizer`` is not None.

    Args:
        fpath (str): path to checkpoint.
        model (nn.Module): model.
        optimizer (Optimizer, optional): an Optimizer.
        scheduler (LRScheduler, optional): an LRScheduler.

    Returns:
        int: start_epoch.

    Examples::
        >>> from torchreid.utils import resume_from_checkpoint
        >>> fpath = 'log/my_model/model.pth.tar-10'
        >>> start_epoch = resume_from_checkpoint(
        >>>     fpath, model, optimizer, scheduler
        >>> )
    """
    print('Loading checkpoint from "{}"'.format(fpath))
    checkpoint = load_checkpoint(fpath)
    model.load_state_dict(checkpoint['state_dict'])
    print('Loaded model weights')
    if optimizer is not None and 'optimizer' in checkpoint.keys():
        optimizer.load_state_dict(checkpoint['optimizer'])
        print('Loaded optimizer')
    if scheduler is not None and 'scheduler' in checkpoint.keys():
        scheduler.load_state_dict(checkpoint['scheduler'])
        print('Loaded scheduler')
    start_epoch = checkpoint['epoch']
    print('Last epoch = {}'.format(start_epoch))
    if 'rank1' in checkpoint.keys():
        print('Last rank1 = {:.1%}'.format(checkpoint['rank1']))
    return start_epoch

def adjust_learning_rate(
    optimizer,
    base_lr,
    epoch,
    stepsize=20,
    gamma=0.1,
    linear_decay=False,
    final_lr=0,
    max_epoch=100
):
    r"""Adjusts learning rate.

    Deprecated.
    """
    if linear_decay:
        # linearly decay the learning rate from base_lr to final_lr
        frac_done = epoch / max_epoch
        lr = frac_done * final_lr + (1. - frac_done) * base_lr
    else:
        # decay the learning rate by gamma every stepsize epochs
        lr = base_lr * (gamma ** (epoch // stepsize))

    for param_group in optimizer.param_groups:
        param_group['lr'] = lr
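A worked example of the step decay above, with illustrative values:

# base_lr = 0.1, stepsize = 20, gamma = 0.1:
#   epochs  0..19 -> lr = 0.1 * 0.1**0 = 0.1
#   epochs 20..39 -> lr = 0.1 * 0.1**1 = 0.01
#   epochs 40..59 -> lr = 0.1 * 0.1**2 = 0.001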
def set_bn_to_eval(m):
    r"""Sets BatchNorm layers to eval mode."""
    # 1. no update for running mean and var
    # 2. scale and shift parameters are still trainable
    classname = m.__class__.__name__
    if classname.find('BatchNorm') != -1:
        m.eval()


def open_all_layers(model):
    r"""Opens all layers in model for training.

    Examples::
        >>> from torchreid.utils import open_all_layers
        >>> open_all_layers(model)
    """
    model.train()
    for p in model.parameters():
        p.requires_grad = True

def open_specified_layers(model, open_layers):
    r"""Opens specified layers in model for training while keeping
    other layers frozen.

    Args:
        model (nn.Module): neural net model.
        open_layers (str or list): layers open for training.

    Examples::
        >>> from torchreid.utils import open_specified_layers
        >>> # Only model.classifier will be updated.
        >>> open_layers = 'classifier'
        >>> open_specified_layers(model, open_layers)
        >>> # Only model.fc and model.classifier will be updated.
        >>> open_layers = ['fc', 'classifier']
        >>> open_specified_layers(model, open_layers)
    """
    if isinstance(model, nn.DataParallel):
        model = model.module

    if isinstance(open_layers, str):
        open_layers = [open_layers]

    for layer in open_layers:
        assert hasattr(
            model, layer
        ), '"{}" is not an attribute of the model, please provide the correct name'.format(
            layer
        )

    for name, module in model.named_children():
        if name in open_layers:
            module.train()
            for p in module.parameters():
                p.requires_grad = True
        else:
            module.eval()
            for p in module.parameters():
                p.requires_grad = False

def count_num_param(model):
    r"""Counts the number of parameters in a model while ignoring ``self.classifier``.

    Args:
        model (nn.Module): network model.

    Examples::
        >>> from torchreid.utils import count_num_param
        >>> model_size = count_num_param(model)

    .. warning::

        This method is deprecated in favor of
        ``torchreid.utils.compute_model_complexity``.
    """
    warnings.warn(
        'This method is deprecated and will be removed in the future.'
    )

    num_param = sum(p.numel() for p in model.parameters())

    if isinstance(model, nn.DataParallel):
        model = model.module

    if hasattr(model,
               'classifier') and isinstance(model.classifier, nn.Module):
        # we ignore the classifier because it is unused at test time
        num_param -= sum(p.numel() for p in model.classifier.parameters())

    return num_param

def load_pretrained_weights(model, weight_path):
    r"""Loads pretrained weights to a model.

    Features::
        - Incompatible layers (unmatched in name or size) will be ignored.
        - Can automatically deal with keys containing "module.".

    Args:
        model (nn.Module): network model.
        weight_path (str): path to pretrained weights.

    Examples::
        >>> from torchreid.utils import load_pretrained_weights
        >>> weight_path = 'log/my_model/model-best.pth.tar'
        >>> load_pretrained_weights(model, weight_path)
    """
    checkpoint = load_checkpoint(weight_path)
    if 'state_dict' in checkpoint:
        state_dict = checkpoint['state_dict']
    else:
        state_dict = checkpoint

    model_dict = model.state_dict()
    new_state_dict = OrderedDict()
    matched_layers, discarded_layers = [], []

    for k, v in state_dict.items():
        if k.startswith('module.'):
            k = k[7:]  # discard the 'module.' prefix

        if k in model_dict and model_dict[k].size() == v.size():
            new_state_dict[k] = v
            matched_layers.append(k)
        else:
            discarded_layers.append(k)

    model_dict.update(new_state_dict)
    model.load_state_dict(model_dict)

    if len(matched_layers) == 0:
        warnings.warn(
            'The pretrained weights "{}" cannot be loaded, '
            'please check the key names manually '
            '(** ignored and continue **)'.format(weight_path)
        )
    else:
        print(
            'Successfully loaded pretrained weights from "{}"'.
            format(weight_path)
        )
        if len(discarded_layers) > 0:
            print(
                '** The following layers are discarded '
                'due to unmatched keys or layer size: {}'.
                format(discarded_layers)
            )
169
yolov7-tracker-example/tracker/trackers/sort_tracker.py
Normal file
@ -0,0 +1,169 @@
"""
Sort
"""

import numpy as np
from collections import deque

from .basetrack import BaseTrack, TrackState
from .tracklet import Tracklet
from .matching import *

class SortTracker(object):
    def __init__(self, args, frame_rate=30):
        self.tracked_tracklets = []  # type: list[Tracklet]
        self.lost_tracklets = []  # type: list[Tracklet]
        self.removed_tracklets = []  # type: list[Tracklet]

        self.frame_id = 0
        self.args = args

        self.det_thresh = args.conf_thresh + 0.1
        self.buffer_size = int(frame_rate / 30.0 * args.track_buffer)
        self.max_time_lost = self.buffer_size

        self.motion = args.kalman_format

    def update(self, output_results, img, ori_img):
        """
        output_results: processed detections (scaled to original size), tlbr format
        """

        self.frame_id += 1
        activated_tracklets = []
        refind_tracklets = []
        lost_tracklets = []
        removed_tracklets = []

        scores = output_results[:, 4]
        bboxes = output_results[:, :4]
        categories = output_results[:, -1]

        remain_inds = scores > self.args.conf_thresh

        dets = bboxes[remain_inds]
        cates = categories[remain_inds]
        scores_keep = scores[remain_inds]

        if len(dets) > 0:
            '''Detections'''
            detections = [Tracklet(tlwh, s, cate, motion=self.motion) for
                          (tlwh, s, cate) in zip(dets, scores_keep, cates)]
        else:
            detections = []

        ''' Step 1: Add newly detected tracklets to tracked_tracklets'''
        unconfirmed = []
        tracked_tracklets = []  # type: list[Tracklet]
        for track in self.tracked_tracklets:
            if not track.is_activated:
                unconfirmed.append(track)
            else:
                tracked_tracklets.append(track)

        ''' Step 2: First association, with high score detection boxes'''
        tracklet_pool = joint_tracklets(tracked_tracklets, self.lost_tracklets)

        # Predict the current location with Kalman
        for tracklet in tracklet_pool:
            tracklet.predict()

        dists = iou_distance(tracklet_pool, detections)

        matches, u_track, u_detection = linear_assignment(dists, thresh=0.9)

        for itracked, idet in matches:
            track = tracklet_pool[itracked]
            det = detections[idet]
            if track.state == TrackState.Tracked:
                track.update(detections[idet], self.frame_id)
                activated_tracklets.append(track)
            else:
                track.re_activate(det, self.frame_id, new_id=False)
                refind_tracklets.append(track)

        '''Deal with unconfirmed tracks, usually tracks with only one beginning frame'''
        detections = [detections[i] for i in u_detection]
        dists = iou_distance(unconfirmed, detections)

        matches, u_unconfirmed, u_detection = linear_assignment(dists, thresh=0.7)

        for itracked, idet in matches:
            unconfirmed[itracked].update(detections[idet], self.frame_id)
            activated_tracklets.append(unconfirmed[itracked])
        for it in u_unconfirmed:
            track = unconfirmed[it]
            track.mark_removed()
            removed_tracklets.append(track)

        """ Step 3: Init new tracklets"""
        for inew in u_detection:
            track = detections[inew]
            if track.score < self.det_thresh:
                continue
            track.activate(self.frame_id)
            activated_tracklets.append(track)

        """ Step 4: Update state"""
        for track in self.lost_tracklets:
            if self.frame_id - track.end_frame > self.max_time_lost:
                track.mark_removed()
                removed_tracklets.append(track)

        self.tracked_tracklets = [t for t in self.tracked_tracklets if t.state == TrackState.Tracked]
        self.tracked_tracklets = joint_tracklets(self.tracked_tracklets, activated_tracklets)
        self.tracked_tracklets = joint_tracklets(self.tracked_tracklets, refind_tracklets)
        self.lost_tracklets = sub_tracklets(self.lost_tracklets, self.tracked_tracklets)
        self.lost_tracklets.extend(lost_tracklets)
        self.lost_tracklets = sub_tracklets(self.lost_tracklets, self.removed_tracklets)
        self.removed_tracklets.extend(removed_tracklets)
        self.tracked_tracklets, self.lost_tracklets = remove_duplicate_tracklets(self.tracked_tracklets, self.lost_tracklets)

        output_tracklets = [track for track in self.tracked_tracklets if track.is_activated]

        return output_tracklets

def joint_tracklets(tlista, tlistb):
    exists = {}
    res = []
    for t in tlista:
        exists[t.track_id] = 1
        res.append(t)
    for t in tlistb:
        tid = t.track_id
        if not exists.get(tid, 0):
            exists[tid] = 1
            res.append(t)
    return res


def sub_tracklets(tlista, tlistb):
    tracklets = {}
    for t in tlista:
        tracklets[t.track_id] = t
    for t in tlistb:
        tid = t.track_id
        if tracklets.get(tid, 0):
            del tracklets[tid]
    return list(tracklets.values())


def remove_duplicate_tracklets(trackletsa, trackletsb):
    pdist = iou_distance(trackletsa, trackletsb)
    pairs = np.where(pdist < 0.15)
    dupa, dupb = list(), list()
    for p, q in zip(*pairs):
        timep = trackletsa[p].frame_id - trackletsa[p].start_frame
        timeq = trackletsb[q].frame_id - trackletsb[q].start_frame
        if timep > timeq:
            dupb.append(q)
        else:
            dupa.append(p)
    resa = [t for i, t in enumerate(trackletsa) if i not in dupa]
    resb = [t for i, t in enumerate(trackletsb) if i not in dupb]
    return resa, resb
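A hedged sketch of driving SortTracker frame by frame; the detector call and the exact args fields are assumptions for illustration (the tracker only reads conf_thresh, track_buffer and kalman_format here):

# from types import SimpleNamespace
# args = SimpleNamespace(conf_thresh=0.5, track_buffer=30, kalman_format='sort')
# tracker = SortTracker(args, frame_rate=30)
# for frame in frames:                      # frames: iterable of HxWx3 images
#     dets = detect(frame)                  # hypothetical: (N, 6) array x1,y1,x2,y2,score,cls
#     online_tracks = tracker.update(dets, frame, frame)
#     for t in online_tracks:
#         print(t.track_id, t.tlwh, t.score)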
338
yolov7-tracker-example/tracker/trackers/sparse_tracker.py
Normal file
@ -0,0 +1,338 @@
"""
Sparse Track
"""

import numpy as np
import torch
from torchvision.ops import nms

import cv2
import torchvision.transforms as T

from .basetrack import BaseTrack, TrackState
from .tracklet import Tracklet, Tracklet_w_depth
from .matching import *

from .reid_models.OSNet import *
from .reid_models.load_model_tools import load_pretrained_weights
from .reid_models.deepsort_reid import Extractor

from .camera_motion_compensation import GMC

REID_MODEL_DICT = {
    'osnet_x1_0': osnet_x1_0,
    'osnet_x0_75': osnet_x0_75,
    'osnet_x0_5': osnet_x0_5,
    'osnet_x0_25': osnet_x0_25,
    'deepsort': Extractor
}


def load_reid_model(reid_model, reid_model_path):

    if 'osnet' in reid_model:
        func = REID_MODEL_DICT[reid_model]
        model = func(num_classes=1, pretrained=False, )
        load_pretrained_weights(model, reid_model_path)
        model.cuda().eval()

    elif 'deepsort' in reid_model:
        model = REID_MODEL_DICT[reid_model](reid_model_path, use_cuda=True)

    else:
        raise NotImplementedError

    return model

class SparseTracker(object):
    def __init__(self, args, frame_rate=30):
        self.tracked_tracklets = []  # type: list[Tracklet]
        self.lost_tracklets = []  # type: list[Tracklet]
        self.removed_tracklets = []  # type: list[Tracklet]

        self.frame_id = 0
        self.args = args

        self.det_thresh = args.conf_thresh + 0.1
        self.buffer_size = int(frame_rate / 30.0 * args.track_buffer)
        self.max_time_lost = self.buffer_size

        self.motion = args.kalman_format

        # camera motion compensation module
        self.gmc = GMC(method='orb', downscale=2, verbose=None)

    def get_deep_range(self, obj, step):
        col = []
        for t in obj:
            lend = (t.deep_vec)[2]
            col.append(lend)
        max_len, mix_len = max(col), min(col)
        if max_len != mix_len:
            deep_range = np.arange(mix_len, max_len, (max_len - mix_len + 1) / step)
            if deep_range[-1] < max_len:
                deep_range = np.concatenate([deep_range, np.array([max_len, ])])
                deep_range[0] = np.floor(deep_range[0])
                deep_range[-1] = np.ceil(deep_range[-1])
        else:
            deep_range = [mix_len, ]
        mask = self.get_sub_mask(deep_range, col)
        return mask

    def get_sub_mask(self, deep_range, col):
        mix_len = deep_range[0]
        max_len = deep_range[-1]
        if max_len == mix_len:
            lc = mix_len
        mask = []
        for d in deep_range:
            if d > deep_range[0] and d < deep_range[-1]:
                mask.append((col >= lc) & (col < d))
                lc = d
            elif d == deep_range[-1]:
                mask.append((col >= lc) & (col <= d))
                lc = d
            else:
                lc = d
                continue
        return mask
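A worked example of the depth binning above (numbers illustrative):

# depths col = [5.0, 9.0, 14.0], step = 3
#   -> deep_range ~ [5.0, 8.33, 11.67, 14.0]   (ends floored/ceiled)
#   -> masks: [T, F, F], [F, T, F], [F, F, T]  (one boolean mask per depth slice)
# DCM below then matches slice by slice, carrying unmatched boxes into the next slice.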
    # core function
    def DCM(self, detections, tracks, activated_tracklets, refind_tracklets, levels, thresh, is_fuse):
        if len(detections) > 0:
            det_mask = self.get_deep_range(detections, levels)
        else:
            det_mask = []

        if len(tracks) != 0:
            track_mask = self.get_deep_range(tracks, levels)
        else:
            track_mask = []

        u_detection, u_tracks, res_det, res_track = [], [], [], []
        if len(track_mask) != 0:
            if len(track_mask) < len(det_mask):
                for i in range(len(det_mask) - len(track_mask)):
                    idx = np.argwhere(det_mask[len(track_mask) + i] == True)
                    for idd in idx:
                        res_det.append(detections[idd[0]])
            elif len(track_mask) > len(det_mask):
                for i in range(len(track_mask) - len(det_mask)):
                    idx = np.argwhere(track_mask[len(det_mask) + i] == True)
                    for idd in idx:
                        res_track.append(tracks[idd[0]])

            for dm, tm in zip(det_mask, track_mask):
                det_idx = np.argwhere(dm == True)
                trk_idx = np.argwhere(tm == True)

                # collect the detections in this depth slice
                det_ = []
                for idd in det_idx:
                    det_.append(detections[idd[0]])
                det_ = det_ + u_detection
                # collect the tracks in this depth slice
                track_ = []
                for idt in trk_idx:
                    track_.append(tracks[idt[0]])
                # carry over the tracks left unmatched in the previous slice
                track_ = track_ + u_tracks

                dists = iou_distance(track_, det_)

                matches, u_track_, u_det_ = linear_assignment(dists, thresh)
                for itracked, idet in matches:
                    track = track_[itracked]
                    det = det_[idet]
                    if track.state == TrackState.Tracked:
                        track.update(det_[idet], self.frame_id)
                        activated_tracklets.append(track)
                    else:
                        track.re_activate(det, self.frame_id, new_id=False)
                        refind_tracklets.append(track)
                u_tracks = [track_[t] for t in u_track_]
                u_detection = [det_[t] for t in u_det_]

            u_tracks = u_tracks + res_track
            u_detection = u_detection + res_det

        else:
            u_detection = detections

        return activated_tracklets, refind_tracklets, u_tracks, u_detection

    def update(self, output_results, img, ori_img):
        """
        output_results: processed detections (scaled to original size), tlwh format
        """

        self.frame_id += 1
        activated_tracklets = []
        refind_tracklets = []
        lost_tracklets = []
        removed_tracklets = []

        scores = output_results[:, 4]
        bboxes = output_results[:, :4]
        categories = output_results[:, -1]

        remain_inds = scores > self.args.conf_thresh
        inds_low = scores > 0.1
        inds_high = scores < self.args.conf_thresh

        inds_second = np.logical_and(inds_low, inds_high)
        dets_second = bboxes[inds_second]
        dets = bboxes[remain_inds]

        cates = categories[remain_inds]
        cates_second = categories[inds_second]

        scores_keep = scores[remain_inds]
        scores_second = scores[inds_second]

        if len(dets) > 0:
            detections = [Tracklet_w_depth(tlwh, s, cate, motion=self.motion) for
                          (tlwh, s, cate) in zip(dets, scores_keep, cates)]
        else:
            detections = []

        ''' Step 1: Add newly detected tracklets to tracked_tracklets'''
        unconfirmed = []
        tracked_tracklets = []  # type: list[Tracklet]
        for track in self.tracked_tracklets:
            if not track.is_activated:
                unconfirmed.append(track)
            else:
                tracked_tracklets.append(track)

        ''' Step 2: First association, with high score detection boxes, depth cascade matching'''
        tracklet_pool = joint_tracklets(tracked_tracklets, self.lost_tracklets)

        # Predict the current location with Kalman
        for tracklet in tracklet_pool:
            tracklet.predict()

        # Camera motion compensation
        warp = self.gmc.apply(ori_img, dets)
        self.gmc.multi_gmc(tracklet_pool, warp)
        self.gmc.multi_gmc(unconfirmed, warp)

        # depth cascade matching
        activated_tracklets, refind_tracklets, u_track, u_detection_high = self.DCM(
            detections,
            tracklet_pool,
            activated_tracklets,
            refind_tracklets,
            levels=3,
            thresh=0.75,
            is_fuse=True)

        ''' Step 3: Second association, with low score detection boxes, depth cascade matching'''
        if len(dets_second) > 0:
            '''Detections'''
            detections_second = [Tracklet_w_depth(tlwh, s, cate, motion=self.motion) for
                                 (tlwh, s, cate) in zip(dets_second, scores_second, cates_second)]
        else:
            detections_second = []

        r_tracked_tracklets = [t for t in u_track if t.state == TrackState.Tracked]

        activated_tracklets, refind_tracklets, u_track, u_detection_sec = self.DCM(
            detections_second,
            r_tracked_tracklets,
            activated_tracklets,
            refind_tracklets,
            levels=3,
            thresh=0.3,
            is_fuse=False)

        for track in u_track:
            if not track.state == TrackState.Lost:
                track.mark_lost()
                lost_tracklets.append(track)

        '''Deal with unconfirmed tracks, usually tracks with only one beginning frame'''
        detections = u_detection_high
        dists = iou_distance(unconfirmed, detections)

        matches, u_unconfirmed, u_detection = linear_assignment(dists, thresh=0.7)

        for itracked, idet in matches:
            unconfirmed[itracked].update(detections[idet], self.frame_id)
            activated_tracklets.append(unconfirmed[itracked])
        for it in u_unconfirmed:
            track = unconfirmed[it]
            track.mark_removed()
            removed_tracklets.append(track)

        """ Step 4: Init new tracklets"""
        for inew in u_detection:
            track = detections[inew]
            if track.score < self.det_thresh:
                continue
            track.activate(self.frame_id)
            activated_tracklets.append(track)

        """ Step 5: Update state"""
        for track in self.lost_tracklets:
            if self.frame_id - track.end_frame > self.max_time_lost:
                track.mark_removed()
                removed_tracklets.append(track)

        self.tracked_tracklets = [t for t in self.tracked_tracklets if t.state == TrackState.Tracked]
        self.tracked_tracklets = joint_tracklets(self.tracked_tracklets, activated_tracklets)
        self.tracked_tracklets = joint_tracklets(self.tracked_tracklets, refind_tracklets)
        self.lost_tracklets = sub_tracklets(self.lost_tracklets, self.tracked_tracklets)
        self.lost_tracklets.extend(lost_tracklets)
        self.lost_tracklets = sub_tracklets(self.lost_tracklets, self.removed_tracklets)
        self.removed_tracklets.extend(removed_tracklets)
        self.tracked_tracklets, self.lost_tracklets = remove_duplicate_tracklets(self.tracked_tracklets, self.lost_tracklets)

        output_tracklets = [track for track in self.tracked_tracklets if track.is_activated]

        return output_tracklets

def joint_tracklets(tlista, tlistb):
    exists = {}
    res = []
    for t in tlista:
        exists[t.track_id] = 1
        res.append(t)
    for t in tlistb:
        tid = t.track_id
        if not exists.get(tid, 0):
            exists[tid] = 1
            res.append(t)
    return res


def sub_tracklets(tlista, tlistb):
    tracklets = {}
    for t in tlista:
        tracklets[t.track_id] = t
    for t in tlistb:
        tid = t.track_id
        if tracklets.get(tid, 0):
            del tracklets[tid]
    return list(tracklets.values())


def remove_duplicate_tracklets(trackletsa, trackletsb):
    pdist = iou_distance(trackletsa, trackletsb)
    pairs = np.where(pdist < 0.15)
    dupa, dupb = list(), list()
    for p, q in zip(*pairs):
        timep = trackletsa[p].frame_id - trackletsa[p].start_frame
        timeq = trackletsb[q].frame_id - trackletsb[q].start_frame
        if timep > timeq:
            dupb.append(q)
        else:
            dupa.append(p)
    resa = [t for i, t in enumerate(trackletsa) if i not in dupa]
    resb = [t for i, t in enumerate(trackletsb) if i not in dupb]
    return resa, resb
327
yolov7-tracker-example/tracker/trackers/strongsort_tracker.py
Normal file
@ -0,0 +1,327 @@
"""
Deep Sort
"""

import numpy as np
import torch
from torchvision.ops import nms

import cv2
import torchvision.transforms as T

from .basetrack import BaseTrack, TrackState
from .tracklet import Tracklet, Tracklet_w_reid
from .matching import *

from .reid_models.OSNet import *
from .reid_models.load_model_tools import load_pretrained_weights
from .reid_models.deepsort_reid import Extractor

REID_MODEL_DICT = {
    'osnet_x1_0': osnet_x1_0,
    'osnet_x0_75': osnet_x0_75,
    'osnet_x0_5': osnet_x0_5,
    'osnet_x0_25': osnet_x0_25,
    'deepsort': Extractor
}


def load_reid_model(reid_model, reid_model_path):

    if 'osnet' in reid_model:
        func = REID_MODEL_DICT[reid_model]
        model = func(num_classes=1, pretrained=False, )
        load_pretrained_weights(model, reid_model_path)
        model.cuda().eval()

    elif 'deepsort' in reid_model:
        model = REID_MODEL_DICT[reid_model](reid_model_path, use_cuda=True)

    else:
        raise NotImplementedError

    return model

class StrongSortTracker(object):

    def __init__(self, args, frame_rate=30):
        self.tracked_tracklets = []  # type: list[Tracklet]
        self.lost_tracklets = []  # type: list[Tracklet]
        self.removed_tracklets = []  # type: list[Tracklet]

        self.frame_id = 0
        self.args = args

        self.det_thresh = args.conf_thresh + 0.1
        self.buffer_size = int(frame_rate / 30.0 * args.track_buffer)
        self.max_time_lost = self.buffer_size

        self.motion = args.kalman_format

        self.with_reid = not args.discard_reid

        self.reid_model, self.crop_transforms = None, None
        if self.with_reid:
            self.reid_model = load_reid_model(args.reid_model, args.reid_model_path)
            self.crop_transforms = T.Compose([
                # T.ToPILImage(),
                # T.Resize(size=(256, 128)),
                T.ToTensor(),  # (c, 128, 256)
                T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
            ])

        self.bbox_crop_size = (64, 128) if 'deepsort' in args.reid_model else (128, 128)

        self.lambda_ = 0.98  # coefficient of the appearance/motion cost mix (Eq. 10 in the StrongSORT paper)

    def reid_preprocess(self, obj_bbox):
        """
        preprocess a cropped object bbox

        obj_bbox: np.ndarray, shape=(h_obj, w_obj, c)

        return:
            torch.Tensor of shape (c, h, w), resized to self.bbox_crop_size
        """
        obj_bbox = cv2.resize(obj_bbox.astype(np.float32) / 255.0, dsize=self.bbox_crop_size)  # shape: (h, w, c)
        return self.crop_transforms(obj_bbox)

    def get_feature(self, tlwhs, ori_img):
        """
        get the appearance feature of each object
        tlwhs: shape (num_of_objects, 4)
        ori_img: original image, np.ndarray, shape (H, W, C)
        """
        obj_bbox = []

        for tlwh in tlwhs:
            tlwh = list(map(int, tlwh))

            # limit to the legal range
            tlwh[0], tlwh[1] = max(tlwh[0], 0), max(tlwh[1], 0)

            tlbr_tensor = self.reid_preprocess(ori_img[tlwh[1]: tlwh[1] + tlwh[3], tlwh[0]: tlwh[0] + tlwh[2]])
            obj_bbox.append(tlbr_tensor)

        if not obj_bbox:
            return np.array([])

        obj_bbox = torch.stack(obj_bbox, dim=0)
        obj_bbox = obj_bbox.cuda()

        features = self.reid_model(obj_bbox)  # shape: (num_of_objects, feature_dim)
        return features.cpu().detach().numpy()

    def update(self, output_results, img, ori_img):
        """
        output_results: processed detections (scaled to original size), tlbr format
        """

        self.frame_id += 1
        activated_tracklets = []
        refind_tracklets = []
        lost_tracklets = []
        removed_tracklets = []

        scores = output_results[:, 4]
        bboxes = output_results[:, :4]
        categories = output_results[:, -1]

        remain_inds = scores > self.args.conf_thresh

        dets = bboxes[remain_inds]
        cates = categories[remain_inds]
        scores_keep = scores[remain_inds]

        features_keep = self.get_feature(tlwhs=dets[:, :4], ori_img=ori_img)

        if len(dets) > 0:
            '''Detections'''
            detections = [Tracklet_w_reid(tlwh, s, cate, motion=self.motion, feat=feat) for
                          (tlwh, s, cate, feat) in zip(dets, scores_keep, cates, features_keep)]
        else:
            detections = []

        ''' Step 1: Add newly detected tracklets to tracked_tracklets'''
        unconfirmed = []
        tracked_tracklets = []  # type: list[Tracklet]
        for track in self.tracked_tracklets:
            if not track.is_activated:
                unconfirmed.append(track)
            else:
                tracked_tracklets.append(track)

        ''' Step 2: First association, with appearance'''
        tracklet_pool = joint_tracklets(tracked_tracklets, self.lost_tracklets)

        # Predict the current location with Kalman
        for tracklet in tracklet_pool:
            tracklet.predict()

        # vanilla matching
        cost_matrix = self.gated_metric(tracklet_pool, detections)
        matches, u_track, u_detection = linear_assignment(cost_matrix, thresh=0.9)

        for itracked, idet in matches:
            track = tracklet_pool[itracked]
            det = detections[idet]
            if track.state == TrackState.Tracked:
                track.update(detections[idet], self.frame_id)
                activated_tracklets.append(track)
            else:
                track.re_activate(det, self.frame_id, new_id=False)
                refind_tracklets.append(track)

        '''Step 3: Second association, with iou'''
        tracklet_for_iou = [tracklet_pool[i] for i in u_track if tracklet_pool[i].state == TrackState.Tracked]
        detection_for_iou = [detections[i] for i in u_detection]

        dists = iou_distance(tracklet_for_iou, detection_for_iou)

        matches, u_track, u_detection = linear_assignment(dists, thresh=0.5)

        for itracked, idet in matches:
            track = tracklet_for_iou[itracked]
            det = detection_for_iou[idet]
            if track.state == TrackState.Tracked:
                track.update(detection_for_iou[idet], self.frame_id)
                activated_tracklets.append(track)
            else:
                track.re_activate(det, self.frame_id, new_id=False)
                refind_tracklets.append(track)

        for it in u_track:
            track = tracklet_for_iou[it]
            if not track.state == TrackState.Lost:
                track.mark_lost()
                lost_tracklets.append(track)

        '''Deal with unconfirmed tracks, usually tracks with only one beginning frame'''
        detections = [detection_for_iou[i] for i in u_detection]
        dists = iou_distance(unconfirmed, detections)

        matches, u_unconfirmed, u_detection = linear_assignment(dists, thresh=0.7)

        for itracked, idet in matches:
            unconfirmed[itracked].update(detections[idet], self.frame_id)
            activated_tracklets.append(unconfirmed[itracked])
        for it in u_unconfirmed:
            track = unconfirmed[it]
            track.mark_removed()
            removed_tracklets.append(track)

        """ Step 4: Init new tracklets"""
        for inew in u_detection:
            track = detections[inew]
            if track.score < self.det_thresh:
                continue
            track.activate(self.frame_id)
            activated_tracklets.append(track)

        """ Step 5: Update state"""
        for track in self.lost_tracklets:
            if self.frame_id - track.end_frame > self.max_time_lost:
                track.mark_removed()
                removed_tracklets.append(track)

        self.tracked_tracklets = [t for t in self.tracked_tracklets if t.state == TrackState.Tracked]
        self.tracked_tracklets = joint_tracklets(self.tracked_tracklets, activated_tracklets)
        self.tracked_tracklets = joint_tracklets(self.tracked_tracklets, refind_tracklets)
        self.lost_tracklets = sub_tracklets(self.lost_tracklets, self.tracked_tracklets)
        self.lost_tracklets.extend(lost_tracklets)
        self.lost_tracklets = sub_tracklets(self.lost_tracklets, self.removed_tracklets)
        self.removed_tracklets.extend(removed_tracklets)
        self.tracked_tracklets, self.lost_tracklets = remove_duplicate_tracklets(self.tracked_tracklets, self.lost_tracklets)

        output_tracklets = [track for track in self.tracked_tracklets if track.is_activated]

        return output_tracklets

    def gated_metric(self, tracks, dets):
        """
        get the cost matrix: first calculate the appearance cost, then gate it by the Kalman state.

        tracks: List[STrack]
        dets: List[STrack]
        """
        appearance_dist = embedding_distance(tracks=tracks, detections=dets, metric='cosine')
        cost_matrix = self.gate_cost_matrix(appearance_dist, tracks, dets)
        return cost_matrix

    def gate_cost_matrix(self, cost_matrix, tracks, dets, max_appearance_thresh=0.15, gated_cost=1e5, only_position=False):
        """
        gate the cost matrix by calculating the Kalman state distance, constrained by
        the 0.95 confidence interval of the chi-square distribution

        cost_matrix: np.ndarray, shape (len(tracks), len(dets))
        tracks: List[STrack]
        dets: List[STrack]
        gated_cost: a very large cost that marks associations as infeasible
        only_position: use [xc, yc, a, h] as the state vector, or only [xc, yc]

        return:
            updated cost_matrix, np.ndarray
        """
        gating_dim = 2 if only_position else 4
        gating_threshold = chi2inv95[gating_dim]
        measurements = np.asarray([Tracklet.tlwh_to_xyah(det.tlwh) for det in dets])  # (len(dets), 4)

        cost_matrix[cost_matrix > max_appearance_thresh] = gated_cost
        for row, track in enumerate(tracks):
            gating_distance = track.kalman_filter.gating_distance(measurements)
            cost_matrix[row, gating_distance > gating_threshold] = gated_cost

            cost_matrix[row] = self.lambda_ * cost_matrix[row] + (1 - self.lambda_) * gating_distance
        return cost_matrix
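A worked example of the gating above (numbers illustrative), with lambda_ = 0.98:

# appearance cost 0.10, Kalman gating distance 3.0 (below chi2inv95[4] = 9.4877):
#   cost = 0.98 * 0.10 + 0.02 * 3.0 = 0.158
# any pair with appearance cost > 0.15 or gating distance > 9.4877 is set to 1e5.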
def joint_tracklets(tlista, tlistb):
    exists = {}
    res = []
    for t in tlista:
        exists[t.track_id] = 1
        res.append(t)
    for t in tlistb:
        tid = t.track_id
        if not exists.get(tid, 0):
            exists[tid] = 1
            res.append(t)
    return res


def sub_tracklets(tlista, tlistb):
    tracklets = {}
    for t in tlista:
        tracklets[t.track_id] = t
    for t in tlistb:
        tid = t.track_id
        if tracklets.get(tid, 0):
            del tracklets[tid]
    return list(tracklets.values())


def remove_duplicate_tracklets(trackletsa, trackletsb):
    pdist = iou_distance(trackletsa, trackletsb)
    pairs = np.where(pdist < 0.15)
    dupa, dupb = list(), list()
    for p, q in zip(*pairs):
        timep = trackletsa[p].frame_id - trackletsa[p].start_frame
        timeq = trackletsb[q].frame_id - trackletsb[q].start_frame
        if timep > timeq:
            dupb.append(q)
        else:
            dupa.append(p)
    resa = [t for i, t in enumerate(trackletsa) if i not in dupa]
    resb = [t for i, t in enumerate(trackletsb) if i not in dupb]
    return resa, resb
366
yolov7-tracker-example/tracker/trackers/tracklet.py
Normal file
@ -0,0 +1,366 @@
"""
implements the base elements of a trajectory
"""

import numpy as np
from collections import deque

from .basetrack import BaseTrack, TrackState
from .kalman_filters.bytetrack_kalman import ByteKalman
from .kalman_filters.botsort_kalman import BotKalman
from .kalman_filters.ocsort_kalman import OCSORTKalman
from .kalman_filters.sort_kalman import SORTKalman
from .kalman_filters.strongsort_kalman import NSAKalman

MOTION_MODEL_DICT = {
    'sort': SORTKalman,
    'byte': ByteKalman,
    'bot': BotKalman,
    'ocsort': OCSORTKalman,
    'strongsort': NSAKalman,
}

STATE_CONVERT_DICT = {
    'sort': 'xysa',
    'byte': 'xyah',
    'bot': 'xywh',
    'ocsort': 'xysa',
    'strongsort': 'xyah'
}

class Tracklet(BaseTrack):
    def __init__(self, tlwh, score, category, motion='byte'):

        # initial position (np.float is removed in recent NumPy; plain float is equivalent)
        self._tlwh = np.asarray(tlwh, dtype=float)
        self.is_activated = False

        self.score = score
        self.category = category

        # kalman
        self.motion = motion
        self.kalman_filter = MOTION_MODEL_DICT[motion]()

        self.convert_func = self.__getattribute__('tlwh_to_' + STATE_CONVERT_DICT[motion])

        # init kalman
        self.kalman_filter.initialize(self.convert_func(self._tlwh))

    def predict(self):
        self.kalman_filter.predict()
        self.time_since_update += 1

    def activate(self, frame_id):
        self.track_id = self.next_id()

        self.state = TrackState.Tracked
        if frame_id == 1:
            self.is_activated = True
        self.frame_id = frame_id
        self.start_frame = frame_id

    def re_activate(self, new_track, frame_id, new_id=False):

        # TODO different convert
        self.kalman_filter.update(self.convert_func(new_track.tlwh))

        self.state = TrackState.Tracked
        self.is_activated = True
        self.frame_id = frame_id
        if new_id:
            self.track_id = self.next_id()
        self.score = new_track.score

    def update(self, new_track, frame_id):
        self.frame_id = frame_id

        new_tlwh = new_track.tlwh
        self.score = new_track.score

        self.kalman_filter.update(self.convert_func(new_tlwh))

        self.state = TrackState.Tracked
        self.is_activated = True

        self.time_since_update = 0

    @property
    def tlwh(self):
        """Get the current position in bounding box format `(top left x, top left y,
        width, height)`.
        """
        return self.__getattribute__(STATE_CONVERT_DICT[self.motion] + '_to_tlwh')()

    def xyah_to_tlwh(self, ):
        x = self.kalman_filter.kf.x
        ret = x[:4].copy()
        ret[2] *= ret[3]
        ret[:2] -= ret[2:] / 2
        return ret

    def xywh_to_tlwh(self, ):
        x = self.kalman_filter.kf.x
        ret = x[:4].copy()
        ret[:2] -= ret[2:] / 2
        return ret

    def xysa_to_tlwh(self, ):
        x = self.kalman_filter.kf.x
        ret = x[:4].copy()
        ret[2] = np.sqrt(x[2] * x[3])
        ret[3] = x[2] / ret[2]

        ret[:2] -= ret[2:] / 2
        return ret
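A hedged sketch of the tracklet life cycle (detection values illustrative):

# t = Tracklet(tlwh=[100, 50, 40, 80], score=0.9, category=0, motion='byte')
# t.activate(frame_id=1)       # assigns an id; activated immediately on frame 1
# t.predict()                  # Kalman predict; ages the track by one frame
# d = Tracklet([104, 52, 40, 80], 0.88, 0, motion='byte')
# t.update(d, frame_id=2)      # Kalman update from the matched detection
# print(t.track_id, t.tlwh)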
class Tracklet_w_reid(Tracklet):
    """
    Tracklet class with reid features, for botsort, deepsort, etc.
    """

    def __init__(self, tlwh, score, category, motion='byte',
                 feat=None, feat_history=50):
        super().__init__(tlwh, score, category, motion)

        self.smooth_feat = None  # EMA feature
        self.curr_feat = None  # current feature
        self.features = deque([], maxlen=feat_history)  # all features
        if feat is not None:
            self.update_features(feat)

        self.alpha = 0.9

    def update_features(self, feat):
        feat /= np.linalg.norm(feat)
        self.curr_feat = feat
        if self.smooth_feat is None:
            self.smooth_feat = feat
        else:
            self.smooth_feat = self.alpha * self.smooth_feat + (1 - self.alpha) * feat
        self.features.append(feat)
        self.smooth_feat /= np.linalg.norm(self.smooth_feat)

    def re_activate(self, new_track, frame_id, new_id=False):

        # TODO different convert
        if isinstance(self.kalman_filter, NSAKalman):
            self.kalman_filter.update(self.convert_func(new_track.tlwh), new_track.score)
        else:
            self.kalman_filter.update(self.convert_func(new_track.tlwh))

        if new_track.curr_feat is not None:
            self.update_features(new_track.curr_feat)

        self.state = TrackState.Tracked
        self.is_activated = True
        self.frame_id = frame_id
        if new_id:
            self.track_id = self.next_id()
        self.score = new_track.score

    def update(self, new_track, frame_id):
        self.frame_id = frame_id

        new_tlwh = new_track.tlwh
        self.score = new_track.score

        if isinstance(self.kalman_filter, NSAKalman):
            self.kalman_filter.update(self.convert_func(new_tlwh), self.score)
        else:
            self.kalman_filter.update(self.convert_func(new_tlwh))

        self.state = TrackState.Tracked
        self.is_activated = True

        if new_track.curr_feat is not None:
            self.update_features(new_track.curr_feat)

        self.time_since_update = 0
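The EMA above, spelled out (alpha = 0.9):

# smooth_t = 0.9 * smooth_{t-1} + 0.1 * feat_t, re-normalized to unit length,
# so one noisy embedding shifts the track's appearance model only slightly,
# while the deque keeps the last feat_history raw features for inspection.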
class Tracklet_w_velocity(Tracklet):
    """
    Tracklet class with velocity direction, for OC-SORT.
    """

    def __init__(self, tlwh, score, category, motion='byte', delta_t=3):
        super().__init__(tlwh, score, category, motion)

        self.last_observation = np.array([-1, -1, -1, -1, -1])  # placeholder
        self.observations = dict()
        self.history_observations = []
        self.velocity = None
        self.delta_t = delta_t

        self.age = 0  # mark the age

    @staticmethod
    def speed_direction(bbox1, bbox2):
        cx1, cy1 = (bbox1[0] + bbox1[2]) / 2.0, (bbox1[1] + bbox1[3]) / 2.0
        cx2, cy2 = (bbox2[0] + bbox2[2]) / 2.0, (bbox2[1] + bbox2[3]) / 2.0
        speed = np.array([cy2 - cy1, cx2 - cx1])
        norm = np.sqrt((cy2 - cy1) ** 2 + (cx2 - cx1) ** 2) + 1e-6
        return speed / norm

    def predict(self):
        self.kalman_filter.predict()

        self.age += 1
        self.time_since_update += 1

    def update(self, new_track, frame_id):
        self.frame_id = frame_id

        new_tlwh = new_track.tlwh
        self.score = new_track.score

        self.kalman_filter.update(self.convert_func(new_tlwh))

        self.state = TrackState.Tracked
        self.is_activated = True
        self.time_since_update = 0

        # update velocity and history buffer
        new_tlbr = Tracklet_w_bbox_buffer.tlwh_to_tlbr(new_tlwh)

        if self.last_observation.sum() >= 0:  # a previous observation exists
            previous_box = None
            for i in range(self.delta_t):
                dt = self.delta_t - i
                if self.age - dt in self.observations:
                    previous_box = self.observations[self.age - dt]
                    break
            if previous_box is None:
                previous_box = self.last_observation
            """
            Estimate the track speed direction with observations \Delta t steps away
            """
            self.velocity = self.speed_direction(previous_box, new_tlbr)

        new_observation = np.r_[new_tlbr, new_track.score]
        self.last_observation = new_observation
        self.observations[self.age] = new_observation
        self.history_observations.append(new_observation)

class Tracklet_w_bbox_buffer(Tracklet):
    """
    Tracklet class with a buffer of bboxes, for C-BIoU track.
    """
    def __init__(self, tlwh, score, category, motion='byte'):
        super().__init__(tlwh, score, category, motion)

        # params of the motion state
        self.b1, self.b2, self.n = 0.3, 0.5, 5
        self.origin_bbox_buffer = deque()  # stores the original bboxes (tlwh) from t - self.n to t, where t is the last detected frame
        self.origin_bbox_buffer.append(self._tlwh)
        # buffered bbox, two buffer sizes
        self.buffer_bbox1 = self.get_buffer_bbox(level=1)
        self.buffer_bbox2 = self.get_buffer_bbox(level=2)
        # motion state, s^{t + \delta} = o^t + (\delta / n) * \sum_{i=t-n+1}^t(o^i - o^{i-1}) = o^t + (\delta / n) * (o^t - o^{t - n})
        self.motion_state1 = self.buffer_bbox1.copy()
        self.motion_state2 = self.buffer_bbox2.copy()

    def get_buffer_bbox(self, level=1, bbox=None):
        """
        get the buffered bbox as: (x, y, w, h) -> (x - b*w, y - b*h, w + 2*b*w, h + 2*b*h)
        level = 1: b = self.b1, level = 2: b = self.b2
        bbox: if not None, use bbox to calculate buffer_bbox, else use self._tlwh
        """
        assert level in [1, 2], 'level must be 1 or 2'

        b = self.b1 if level == 1 else self.b2

        if bbox is None:
            buffer_bbox = self._tlwh + np.array([-b * self._tlwh[2], -b * self._tlwh[3], 2 * b * self._tlwh[2], 2 * b * self._tlwh[3]])
        else:
            buffer_bbox = bbox + np.array([-b * bbox[2], -b * bbox[3], 2 * b * bbox[2], 2 * b * bbox[3]])
        return np.maximum(0.0, buffer_bbox)

    def re_activate(self, new_track, frame_id, new_id=False):

        # TODO different convert
        self.kalman_filter.update(self.convert_func(new_track.tlwh))

        self.state = TrackState.Tracked
        self.is_activated = True
        self.frame_id = frame_id
        if new_id:
            self.track_id = self.next_id()
        self.score = new_track.score

        self._tlwh = new_track._tlwh
        # update the stored bbox buffer
        if len(self.origin_bbox_buffer) > self.n:
            self.origin_bbox_buffer.popleft()
        self.origin_bbox_buffer.append(self._tlwh)

        self.buffer_bbox1 = self.get_buffer_bbox(level=1)
        self.buffer_bbox2 = self.get_buffer_bbox(level=2)
        self.motion_state1 = self.buffer_bbox1.copy()
        self.motion_state2 = self.buffer_bbox2.copy()

    def update(self, new_track, frame_id):
        self.frame_id = frame_id

        new_tlwh = new_track.tlwh
        self.score = new_track.score

        self.kalman_filter.update(self.convert_func(new_tlwh))

        self.state = TrackState.Tracked
        self.is_activated = True

        self.time_since_update = 0

        # update the stored bbox buffer
        if len(self.origin_bbox_buffer) > self.n:
            self.origin_bbox_buffer.popleft()
        self.origin_bbox_buffer.append(new_tlwh)

        # update the motion state
        if self.time_since_update:  # there were some unmatched frames
            if len(self.origin_bbox_buffer) < self.n:
                self.motion_state1 = self.get_buffer_bbox(level=1, bbox=new_tlwh)
                self.motion_state2 = self.get_buffer_bbox(level=2, bbox=new_tlwh)
            else:  # s^{t + \delta} = o^t + (\delta / n) * (o^t - o^{t - n})
                motion_state = self.origin_bbox_buffer[-1] + \
                    (self.time_since_update / self.n) * (self.origin_bbox_buffer[-1] - self.origin_bbox_buffer[0])
                self.motion_state1 = self.get_buffer_bbox(level=1, bbox=motion_state)
                self.motion_state2 = self.get_buffer_bbox(level=2, bbox=motion_state)

        else:  # no unmatched frames, use the current detection as the motion state
            self.motion_state1 = self.get_buffer_bbox(level=1, bbox=new_tlwh)
            self.motion_state2 = self.get_buffer_bbox(level=2, bbox=new_tlwh)
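A worked example of get_buffer_bbox (numbers illustrative):

# tlwh = (100, 100, 50, 40), level 1 so b = 0.3:
#   (100 - 0.3*50, 100 - 0.3*40, 50 + 0.6*50, 40 + 0.6*40) = (85, 88, 80, 64)
# i.e. the box is inflated by 30% of its width/height on each side before buffered-IoU matching.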
class Tracklet_w_depth(Tracklet):
    """
    tracklet with depth info (i.e., 2000 - y2), for SparseTrack
    """

    def __init__(self, tlwh, score, category, motion='byte'):
        super().__init__(tlwh, score, category, motion)

    @property
    # @jit(nopython=True)
    def deep_vec(self):
        """Convert the bounding box to a depth vector `(center x, y2, depth)`,
        where y2 is the bottom edge and depth is the pseudo-depth `2000 - y2`.
        """
        ret = self.tlwh.copy()
        cx = ret[0] + 0.5 * ret[2]
        y2 = ret[1] + ret[3]
        length = 2000 - y2
        return np.asarray([cx, y2, length], dtype=float)
5
yolov7-tracker-example/tracker/trackeval/__init__.py
Normal file
@ -0,0 +1,5 @@
from .eval import Evaluator
from . import datasets
from . import metrics
from . import plotting
from . import utils
65
yolov7-tracker-example/tracker/trackeval/_timing.py
Normal file
@ -0,0 +1,65 @@
from functools import wraps
from time import perf_counter
import inspect

DO_TIMING = False
DISPLAY_LESS_PROGRESS = False

timer_dict = {}
counter = 0


def time(f):
    @wraps(f)
    def wrap(*args, **kw):
        if DO_TIMING:
            # Run function with timing
            ts = perf_counter()
            result = f(*args, **kw)
            te = perf_counter()
            tt = te - ts

            # Get function name
            arg_names = inspect.getfullargspec(f)[0]
            if arg_names[0] == 'self' and DISPLAY_LESS_PROGRESS:
                return result
            elif arg_names[0] == 'self':
                method_name = type(args[0]).__name__ + '.' + f.__name__
            else:
                method_name = f.__name__

            # Record the cumulative time spent in each function for analysis
            if method_name in timer_dict.keys():
                timer_dict[method_name] += tt
            else:
                timer_dict[method_name] = tt

            # If the code is finished, display a timing summary
            if method_name == "Evaluator.evaluate":
                print("")
                print("Timing analysis:")
                for key, value in timer_dict.items():
                    print('%-70s %2.4f sec' % (key, value))
            else:
                # Get function argument values for printing special arguments of interest
                arg_titles = ['tracker', 'seq', 'cls']
                arg_vals = []
                for i, a in enumerate(arg_names):
                    if a in arg_titles:
                        arg_vals.append(args[i])
                arg_text = '(' + ', '.join(arg_vals) + ')'

                # Display methods and functions with different indentation.
                if arg_names[0] == 'self':
                    print('%-74s %2.4f sec' % (' ' * 4 + method_name + arg_text, tt))
                elif arg_names[0] == 'test':
                    pass
                else:
                    global counter
                    counter += 1
                    print('%i %-70s %2.4f sec' % (counter, method_name + arg_text, tt))

            return result
        else:
            # If config["TIME_PROGRESS"] is false, or config["USE_PARALLEL"] is true, run functions normally without timing.
            return f(*args, **kw)
    return wrap
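A hedged usage sketch of the decorator (module path assumed; timing stays off unless DO_TIMING is flipped on first):

# import trackeval._timing as _timing
# _timing.DO_TIMING = True
#
# @_timing.time
# def slow_step(n):
#     return sum(i * i for i in range(n))
#
# slow_step(10**6)   # prints e.g. '1 slow_step()  0.0812 sec' and accumulates into timer_dict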
@ -0,0 +1,6 @@
import baseline_utils
import stp
import non_overlap
import pascal_colormap
import thresholder
import vizualize
@ -0,0 +1,321 @@

import os
import csv
import numpy as np
from copy import deepcopy
from PIL import Image
from pycocotools import mask as mask_utils
from scipy.optimize import linear_sum_assignment
from trackeval.baselines.pascal_colormap import pascal_colormap


def load_seq(file_to_load):
    """ Load input data from file in RobMOTS format (e.g. provided detections).

    Returns: Data object with the following structure (see STP):
        data['cls'][t] = {'ids', 'scores', 'im_hs', 'im_ws', 'mask_rles'}
    """
    fp = open(file_to_load)
    dialect = csv.Sniffer().sniff(fp.readline(), delimiters=' ')
    dialect.skipinitialspace = True
    fp.seek(0)
    reader = csv.reader(fp, dialect)
    read_data = {}
    num_timesteps = 0
    for i, row in enumerate(reader):
        if row[-1] == '':
            row = row[:-1]
        t = int(row[0])
        cid = row[1]
        c = int(row[2])
        s = row[3]
        h = row[4]
        w = row[5]
        rle = row[6]

        if t >= num_timesteps:
            num_timesteps = t + 1

        if c in read_data.keys():
            if t in read_data[c].keys():
                read_data[c][t]['ids'].append(cid)
                read_data[c][t]['scores'].append(s)
                read_data[c][t]['im_hs'].append(h)
                read_data[c][t]['im_ws'].append(w)
                read_data[c][t]['mask_rles'].append(rle)
            else:
                read_data[c][t] = {}
                read_data[c][t]['ids'] = [cid]
                read_data[c][t]['scores'] = [s]
                read_data[c][t]['im_hs'] = [h]
                read_data[c][t]['im_ws'] = [w]
                read_data[c][t]['mask_rles'] = [rle]
        else:
            read_data[c] = {t: {}}
            read_data[c][t]['ids'] = [cid]
            read_data[c][t]['scores'] = [s]
            read_data[c][t]['im_hs'] = [h]
            read_data[c][t]['im_ws'] = [w]
            read_data[c][t]['mask_rles'] = [rle]
    fp.close()

    data = {}
    for c in read_data.keys():
        data[c] = [{} for _ in range(num_timesteps)]
        for t in range(num_timesteps):
            if t in read_data[c].keys():
                data[c][t]['ids'] = np.atleast_1d(read_data[c][t]['ids']).astype(int)
                data[c][t]['scores'] = np.atleast_1d(read_data[c][t]['scores']).astype(float)
                data[c][t]['im_hs'] = np.atleast_1d(read_data[c][t]['im_hs']).astype(int)
                data[c][t]['im_ws'] = np.atleast_1d(read_data[c][t]['im_ws']).astype(int)
                data[c][t]['mask_rles'] = np.atleast_1d(read_data[c][t]['mask_rles']).astype(str)
            else:
                data[c][t]['ids'] = np.empty(0).astype(int)
                data[c][t]['scores'] = np.empty(0).astype(float)
                data[c][t]['im_hs'] = np.empty(0).astype(int)
                data[c][t]['im_ws'] = np.empty(0).astype(int)
                data[c][t]['mask_rles'] = np.empty(0).astype(str)
    return data
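
# Hedged usage sketch (not part of the original file): each row of a RobMOTS
# detection file is "timestep id class score im_h im_w mask_rle", so for a
# hypothetical file 'dets.txt' one could inspect the parsed result like this:
#
#   data = load_seq('dets.txt')
#   dets_t0 = data[1][0]  # detections of class 1 at timestep 0
#   print(dets_t0['ids'], dets_t0['scores'], dets_t0['mask_rles'])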


def threshold(tdata, thresh):
    """ Removes detections below a certain threshold ('thresh') score. """
    new_data = {}
    to_keep = tdata['scores'] > thresh
    for field in ['ids', 'scores', 'im_hs', 'im_ws', 'mask_rles']:
        new_data[field] = tdata[field][to_keep]
    return new_data
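
# Hedged usage sketch: 'scores' is a float np.array, so 'scores > thresh'
# yields a boolean keep-mask that is applied to every field in parallel.
# Keeping only detections scoring above 0.5 from one timestep dict t_data:
#
#   kept = threshold(t_data, 0.5)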


def create_coco_mask(mask_rles, im_hs, im_ws):
    """ Converts mask as rle text (+ height and width) to encoded version used by pycocotools. """
    coco_masks = [{'size': [h, w], 'counts': m.encode(encoding='UTF-8')}
                  for h, w, m in zip(im_hs, im_ws, mask_rles)]
    return coco_masks


def mask_iou(mask_rles1, mask_rles2, im_hs, im_ws, do_ioa=0):
    """ Calculate mask IoU between two sets of masks.
    Further allows 'intersection over area' instead of IoU (over the area of mask_rles1).
    Allows either to pass in 1 boolean for do_ioa for all mask_rles2 or also one for each mask_rles2.
    It is recommended that mask_rles1 is a detection and mask_rles2 is a groundtruth.
    """
    coco_masks1 = create_coco_mask(mask_rles1, im_hs, im_ws)
    coco_masks2 = create_coco_mask(mask_rles2, im_hs, im_ws)

    if not hasattr(do_ioa, "__len__"):
        do_ioa = [do_ioa] * len(coco_masks2)
    assert len(coco_masks2) == len(do_ioa)
    if len(coco_masks1) == 0 or len(coco_masks2) == 0:
        iou = np.zeros((len(coco_masks1), len(coco_masks2)))
    else:
        iou = mask_utils.iou(coco_masks1, coco_masks2, do_ioa)
    return iou
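
# Hedged sketch of the do_ioa semantics: with do_ioa=1 the denominator is the
# area of the first mask only (area(det & gt) / area(det)), which is how
# detections are tested against crowd ignore regions. Assuming two RLE strings
# rle_det and rle_gt for a 480x640 frame:
#
#   ious = mask_iou([rle_det], [rle_gt], [480], [640])            # 1x1 IoU matrix
#   ioas = mask_iou([rle_det], [rle_gt], [480], [640], do_ioa=1)  # intersection over det area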


def sort_by_score(t_data):
    """ Sorts data by score """
    sort_index = np.argsort(t_data['scores'])[::-1]
    for k in t_data.keys():
        t_data[k] = t_data[k][sort_index]
    return t_data


def mask_NMS(t_data, nms_threshold=0.5, already_sorted=False):
    """ Remove redundant masks by performing non-maximum suppression (NMS) """

    # Sort by score
    if not already_sorted:
        t_data = sort_by_score(t_data)

    # Calculate the mask IoU between all detections in the timestep.
    mask_ious_all = mask_iou(t_data['mask_rles'], t_data['mask_rles'], t_data['im_hs'], t_data['im_ws'])

    # Determine which masks NMS should remove
    # (those overlapping greater than nms_threshold with another mask that has a higher score)
    num_dets = len(t_data['mask_rles'])
    to_remove = [False for _ in range(num_dets)]
    for i in range(num_dets):
        if not to_remove[i]:
            for j in range(i + 1, num_dets):
                if mask_ious_all[i, j] > nms_threshold:
                    to_remove[j] = True

    # Remove detections which should be removed
    to_keep = np.logical_not(to_remove)
    for k in t_data.keys():
        t_data[k] = t_data[k][to_keep]

    return t_data
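
# Hedged worked example: with scores [0.9, 0.8, 0.7] (already sorted) and
# pairwise IoUs iou(0,1)=0.6, iou(0,2)=0.1, iou(1,2)=0.6 at the default
# nms_threshold=0.5, detection 1 is suppressed by detection 0, and detection 2
# is kept: the 'if not to_remove[i]' guard stops already-suppressed masks from
# suppressing others.
#
#   t_data = mask_NMS(t_data, nms_threshold=0.5)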


def non_overlap(t_data, already_sorted=False):
    """ Enforces masks to be non-overlapping in an image, does this by putting masks 'on top of one another',
    such that higher score masks 'occlude' and thus remove parts of lower scoring masks.

    Help wanted: if anyone knows a way to do this WITHOUT converting the RLE to the np.array let me know, because that
    would be MUCH more efficient. (I have tried, but haven't yet had success).
    """

    # Sort by score
    if not already_sorted:
        t_data = sort_by_score(t_data)

    # Get coco masks
    coco_masks = create_coco_mask(t_data['mask_rles'], t_data['im_hs'], t_data['im_ws'])

    # Create a single np.array to hold all of the non-overlapping masks
    masks_array = np.zeros((t_data['im_hs'][0], t_data['im_ws'][0]), 'uint8')

    # Decode each mask into a np.array, and place it into the overall array for the whole frame.
    # Since masks with the lowest score are placed first, they are 'partially overridden' by masks with a higher score
    # if they overlap.
    for i, mask in enumerate(coco_masks[::-1]):
        masks_array[mask_utils.decode(mask).astype('bool')] = i + 1

    # Encode the resulting np.array back into a set of coco_masks which are now non-overlapping.
    num_dets = len(coco_masks)
    for i, j in enumerate(range(1, num_dets + 1)[::-1]):
        coco_masks[i] = mask_utils.encode(np.asfortranarray(masks_array == j, dtype=np.uint8))

    # Convert from coco_mask back into our mask_rle format.
    t_data['mask_rles'] = [m['counts'].decode("utf-8") for m in coco_masks]

    return t_data
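
# Hedged intuition sketch: with two overlapping masks A (score 0.9) and
# B (score 0.6), B is painted into masks_array first and A on top, so the
# shared pixels end up labelled as A; re-encoding 'masks_array == label' per
# detection then yields RLEs whose pixel sets are disjoint:
#
#   t_data = non_overlap(t_data, already_sorted=True)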


def masks2boxes(mask_rles, im_hs, im_ws):
    """ Extracts bounding boxes which surround a set of masks. """
    coco_masks = create_coco_mask(mask_rles, im_hs, im_ws)
    boxes = np.array([mask_utils.toBbox(x) for x in coco_masks])
    if len(boxes) == 0:
        boxes = np.empty((0, 4))
    return boxes


def box_iou(bboxes1, bboxes2, box_format='xywh', do_ioa=False, do_giou=False):
    """ Calculates the IOU (intersection over union) between two arrays of boxes.
    Allows variable box formats ('xywh' and 'x0y0x1y1').
    If do_ioa (intersection over area), then calculates the intersection over the area of boxes1 - this is commonly
    used to determine if detections are within crowd ignore region.
    If do_giou (generalized intersection over union), then calculates giou.
    """
    if len(bboxes1) == 0 or len(bboxes2) == 0:
        ious = np.zeros((len(bboxes1), len(bboxes2)))
        return ious
    if box_format == 'xywh':
        # layout: (x0, y0, w, h)
        bboxes1 = deepcopy(bboxes1)
        bboxes2 = deepcopy(bboxes2)

        bboxes1[:, 2] = bboxes1[:, 0] + bboxes1[:, 2]
        bboxes1[:, 3] = bboxes1[:, 1] + bboxes1[:, 3]
        bboxes2[:, 2] = bboxes2[:, 0] + bboxes2[:, 2]
        bboxes2[:, 3] = bboxes2[:, 1] + bboxes2[:, 3]
    elif box_format != 'x0y0x1y1':
        raise Exception('box_format %s is not implemented' % box_format)

    # layout: (x0, y0, x1, y1)
    min_ = np.minimum(bboxes1[:, np.newaxis, :], bboxes2[np.newaxis, :, :])
    max_ = np.maximum(bboxes1[:, np.newaxis, :], bboxes2[np.newaxis, :, :])
    intersection = np.maximum(min_[..., 2] - max_[..., 0], 0) * np.maximum(min_[..., 3] - max_[..., 1], 0)
    area1 = (bboxes1[..., 2] - bboxes1[..., 0]) * (bboxes1[..., 3] - bboxes1[..., 1])

    if do_ioa:
        ioas = np.zeros_like(intersection)
        valid_mask = area1 > 0 + np.finfo('float').eps
        ioas[valid_mask, :] = intersection[valid_mask, :] / area1[valid_mask][:, np.newaxis]

        return ioas
    else:
        area2 = (bboxes2[..., 2] - bboxes2[..., 0]) * (bboxes2[..., 3] - bboxes2[..., 1])
        union = area1[:, np.newaxis] + area2[np.newaxis, :] - intersection
        intersection[area1 <= 0 + np.finfo('float').eps, :] = 0
        intersection[:, area2 <= 0 + np.finfo('float').eps] = 0
        intersection[union <= 0 + np.finfo('float').eps] = 0
        union[union <= 0 + np.finfo('float').eps] = 1
        ious = intersection / union

        if do_giou:
            enclosing_area = np.maximum(max_[..., 2] - min_[..., 0], 0) * np.maximum(max_[..., 3] - min_[..., 1], 0)
            eps = 1e-7
            # giou
            ious = ious - ((enclosing_area - union) / (enclosing_area + eps))

        return ious
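
# Hedged worked example in the default 'xywh' format: A = (0, 0, 2, 2) and
# B = (1, 1, 2, 2) overlap in a 1x1 square, so IoU = 1 / (4 + 4 - 1) = 1/7,
# while the intersection over the area of A is 1/4:
#
#   a = np.array([[0., 0., 2., 2.]])
#   b = np.array([[1., 1., 2., 2.]])
#   print(box_iou(a, b))               # approx. 0.143
#   print(box_iou(a, b, do_ioa=True))  # 0.25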


def match(match_scores):
    """ Finds the best one-to-one assignment maximising the total matching score. """
    match_rows, match_cols = linear_sum_assignment(-match_scores)
    return match_rows, match_cols
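
# Hedged worked example: scipy's linear_sum_assignment minimises total cost,
# so negating the scores turns this into score maximisation. For
#
#   match_scores = np.array([[0.9, 0.1],
#                            [0.2, 0.8]])
#
# match(match_scores) pairs row 0 with col 0 and row 1 with col 1
# (total score 0.9 + 0.8 = 1.7, the better of the two possible assignments).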


def write_seq(output_data, out_file):
    out_loc = os.path.dirname(out_file)
    if not os.path.exists(out_loc):
        os.makedirs(out_loc, exist_ok=True)
    fp = open(out_file, 'w', newline='')
    writer = csv.writer(fp, delimiter=' ')
    for row in output_data:
        writer.writerow(row)
    fp.close()


def combine_classes(data):
    """ Converts data from a class-separated to a class-combined format.
    Input format: data['cls'][t] = {'ids', 'scores', 'im_hs', 'im_ws', 'mask_rles'}
    Output format: data[t] = {'ids', 'scores', 'im_hs', 'im_ws', 'mask_rles', 'cls'}
    """
    output_data = [{} for _ in list(data.values())[0]]
    for cls, cls_data in data.items():
        for timestep, t_data in enumerate(cls_data):
            for k in t_data.keys():
                if k in output_data[timestep].keys():
                    output_data[timestep][k] += list(t_data[k])
                else:
                    output_data[timestep][k] = list(t_data[k])
            # Append one class label per detection added at this timestep
            # (using len(t_data['ids']), not the cumulative length, so the
            # 'cls' list stays aligned with the merged 'ids' list).
            if 'cls' in output_data[timestep].keys():
                output_data[timestep]['cls'] += [cls]*len(t_data['ids'])
            else:
                output_data[timestep]['cls'] = [cls]*len(t_data['ids'])

    for timestep, t_data in enumerate(output_data):
        for k in t_data.keys():
            output_data[timestep][k] = np.array(output_data[timestep][k])

    return output_data
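
# Hedged shape sketch: for two classes {1: ..., 2: ...} with 3 and 2
# detections at timestep 0, combine_classes yields output_data[0]['ids'] of
# length 5 alongside a parallel output_data[0]['cls'] array [1, 1, 1, 2, 2],
# so every merged detection keeps its class label.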


def save_as_png(t_data, out_file, im_h, im_w):
    """ Save a set of segmentation masks into a PNG format, the same as used for the DAVIS dataset."""

    if len(t_data['mask_rles']) > 0:
        coco_masks = create_coco_mask(t_data['mask_rles'], t_data['im_hs'], t_data['im_ws'])

        list_of_np_masks = [mask_utils.decode(mask) for mask in coco_masks]

        png = np.zeros((t_data['im_hs'][0], t_data['im_ws'][0]))
        for mask, c_id in zip(list_of_np_masks, t_data['ids']):
            png[mask.astype("bool")] = c_id + 1
    else:
        png = np.zeros((im_h, im_w))

    if not os.path.exists(os.path.dirname(out_file)):
        os.makedirs(os.path.dirname(out_file))

    colmap = (np.array(pascal_colormap) * 255).round().astype("uint8")
    palimage = Image.new('P', (16, 16))
    palimage.putpalette(colmap)
    im = Image.fromarray(np.squeeze(png.astype("uint8")))
    im2 = im.quantize(palette=palimage)
    im2.save(out_file)


def get_frame_size(data):
    """ Gets frame height and width from data. """
    for cls, cls_data in data.items():
        for timestep, t_data in enumerate(cls_data):
            if len(t_data['im_hs']) > 0:
                im_h = t_data['im_hs'][0]
                im_w = t_data['im_ws'][0]
                return im_h, im_w
    return None
@ -0,0 +1,92 @@
"""
Non-Overlap: Code to take in a set of raw detections and produce a set of non-overlapping detections from it.

Author: Jonathon Luiten
"""

import os
import sys
from multiprocessing.pool import Pool
from multiprocessing import freeze_support

sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '..')))
from trackeval.baselines import baseline_utils as butils
from trackeval.utils import get_code_path

code_path = get_code_path()
config = {
    'INPUT_FOL': os.path.join(code_path, 'data/detections/rob_mots/{split}/raw_supplied/data/'),
    'OUTPUT_FOL': os.path.join(code_path, 'data/detections/rob_mots/{split}/non_overlap_supplied/data/'),
    'SPLIT': 'train',  # valid: 'train', 'val', 'test'.
    'Benchmarks': None,  # If None, all benchmarks in SPLIT.

    'Num_Parallel_Cores': None,  # If None, run without parallel.

    'THRESHOLD_NMS_MASK_IOU': 0.5,
}


def do_sequence(seq_file):

    # Load input data from file (e.g. provided detections)
    # data format: data['cls'][t] = {'ids', 'scores', 'im_hs', 'im_ws', 'mask_rles'}
    data = butils.load_seq(seq_file)

    # Convert data from a class-separated to a class-combined format.
    # data[t] = {'ids', 'scores', 'im_hs', 'im_ws', 'mask_rles', 'cls'}
    data = butils.combine_classes(data)

    # Where to accumulate output data for writing out
    output_data = []

    # Run for each timestep.
    for timestep, t_data in enumerate(data):

        # Remove redundant masks by performing non-maximum suppression (NMS)
        t_data = butils.mask_NMS(t_data, nms_threshold=config['THRESHOLD_NMS_MASK_IOU'])

        # Perform non-overlap, to get non-overlapping masks.
        t_data = butils.non_overlap(t_data, already_sorted=True)

        # Save result in output format to write to file later.
        # Output Format = [timestep ID class score im_h im_w mask_RLE]
        for i in range(len(t_data['ids'])):
            row = [timestep, int(t_data['ids'][i]), t_data['cls'][i], t_data['scores'][i], t_data['im_hs'][i],
                   t_data['im_ws'][i], t_data['mask_rles'][i]]
            output_data.append(row)

    # Write results to file
    out_file = seq_file.replace(config['INPUT_FOL'].format(split=config['SPLIT']),
                                config['OUTPUT_FOL'].format(split=config['SPLIT']))
    butils.write_seq(output_data, out_file)

    print('DONE:', seq_file)


if __name__ == '__main__':

    # Required to fix bug in multiprocessing on windows.
    freeze_support()

    # Obtain list of sequences to run tracker for.
    if config['Benchmarks']:
        benchmarks = config['Benchmarks']
    else:
        benchmarks = ['davis_unsupervised', 'kitti_mots', 'youtube_vis', 'ovis', 'bdd_mots', 'tao']
        if config['SPLIT'] != 'train':
            benchmarks += ['waymo', 'mots_challenge']
    seqs_todo = []
    for bench in benchmarks:
        bench_fol = os.path.join(config['INPUT_FOL'].format(split=config['SPLIT']), bench)
        seqs_todo += [os.path.join(bench_fol, seq) for seq in os.listdir(bench_fol)]

    # Run in parallel
    if config['Num_Parallel_Cores']:
        with Pool(config['Num_Parallel_Cores']) as pool:
            results = pool.map(do_sequence, seqs_todo)

    # Run in series
    else:
        for seq_todo in seqs_todo:
            do_sequence(seq_todo)
@ -0,0 +1,257 @@
pascal_colormap = [
    0, 0, 0,
    0.5020, 0, 0,
    0, 0.5020, 0,
    0.5020, 0.5020, 0,
    0, 0, 0.5020,
    0.5020, 0, 0.5020,
    0, 0.5020, 0.5020,
    0.5020, 0.5020, 0.5020,
    0.2510, 0, 0,
    0.7529, 0, 0,
    0.2510, 0.5020, 0,
    0.7529, 0.5020, 0,
    0.2510, 0, 0.5020,
    0.7529, 0, 0.5020,
    0.2510, 0.5020, 0.5020,
    0.7529, 0.5020, 0.5020,
    0, 0.2510, 0,
    0.5020, 0.2510, 0,
    0, 0.7529, 0,
    0.5020, 0.7529, 0,
    0, 0.2510, 0.5020,
    0.5020, 0.2510, 0.5020,
    0, 0.7529, 0.5020,
    0.5020, 0.7529, 0.5020,
    0.2510, 0.2510, 0,
    0.7529, 0.2510, 0,
    0.2510, 0.7529, 0,
    0.7529, 0.7529, 0,
    0.2510, 0.2510, 0.5020,
    0.7529, 0.2510, 0.5020,
    0.2510, 0.7529, 0.5020,
    0.7529, 0.7529, 0.5020,
    0, 0, 0.2510,
    0.5020, 0, 0.2510,
    0, 0.5020, 0.2510,
    0.5020, 0.5020, 0.2510,
    0, 0, 0.7529,
    0.5020, 0, 0.7529,
    0, 0.5020, 0.7529,
    0.5020, 0.5020, 0.7529,
    0.2510, 0, 0.2510,
    0.7529, 0, 0.2510,
    0.2510, 0.5020, 0.2510,
    0.7529, 0.5020, 0.2510,
    0.2510, 0, 0.7529,
    0.7529, 0, 0.7529,
    0.2510, 0.5020, 0.7529,
    0.7529, 0.5020, 0.7529,
    0, 0.2510, 0.2510,
    0.5020, 0.2510, 0.2510,
    0, 0.7529, 0.2510,
    0.5020, 0.7529, 0.2510,
    0, 0.2510, 0.7529,
    0.5020, 0.2510, 0.7529,
    0, 0.7529, 0.7529,
    0.5020, 0.7529, 0.7529,
    0.2510, 0.2510, 0.2510,
    0.7529, 0.2510, 0.2510,
    0.2510, 0.7529, 0.2510,
    0.7529, 0.7529, 0.2510,
    0.2510, 0.2510, 0.7529,
    0.7529, 0.2510, 0.7529,
    0.2510, 0.7529, 0.7529,
    0.7529, 0.7529, 0.7529,
    0.1255, 0, 0,
    0.6275, 0, 0,
    0.1255, 0.5020, 0,
    0.6275, 0.5020, 0,
    0.1255, 0, 0.5020,
    0.6275, 0, 0.5020,
    0.1255, 0.5020, 0.5020,
    0.6275, 0.5020, 0.5020,
    0.3765, 0, 0,
    0.8784, 0, 0,
    0.3765, 0.5020, 0,
    0.8784, 0.5020, 0,
    0.3765, 0, 0.5020,
    0.8784, 0, 0.5020,
    0.3765, 0.5020, 0.5020,
    0.8784, 0.5020, 0.5020,
    0.1255, 0.2510, 0,
    0.6275, 0.2510, 0,
    0.1255, 0.7529, 0,
    0.6275, 0.7529, 0,
    0.1255, 0.2510, 0.5020,
    0.6275, 0.2510, 0.5020,
    0.1255, 0.7529, 0.5020,
    0.6275, 0.7529, 0.5020,
    0.3765, 0.2510, 0,
    0.8784, 0.2510, 0,
    0.3765, 0.7529, 0,
    0.8784, 0.7529, 0,
    0.3765, 0.2510, 0.5020,
    0.8784, 0.2510, 0.5020,
    0.3765, 0.7529, 0.5020,
    0.8784, 0.7529, 0.5020,
    0.1255, 0, 0.2510,
    0.6275, 0, 0.2510,
    0.1255, 0.5020, 0.2510,
    0.6275, 0.5020, 0.2510,
    0.1255, 0, 0.7529,
    0.6275, 0, 0.7529,
    0.1255, 0.5020, 0.7529,
    0.6275, 0.5020, 0.7529,
    0.3765, 0, 0.2510,
    0.8784, 0, 0.2510,
    0.3765, 0.5020, 0.2510,
    0.8784, 0.5020, 0.2510,
    0.3765, 0, 0.7529,
    0.8784, 0, 0.7529,
    0.3765, 0.5020, 0.7529,
    0.8784, 0.5020, 0.7529,
    0.1255, 0.2510, 0.2510,
    0.6275, 0.2510, 0.2510,
    0.1255, 0.7529, 0.2510,
    0.6275, 0.7529, 0.2510,
    0.1255, 0.2510, 0.7529,
    0.6275, 0.2510, 0.7529,
    0.1255, 0.7529, 0.7529,
    0.6275, 0.7529, 0.7529,
    0.3765, 0.2510, 0.2510,
    0.8784, 0.2510, 0.2510,
    0.3765, 0.7529, 0.2510,
    0.8784, 0.7529, 0.2510,
    0.3765, 0.2510, 0.7529,
    0.8784, 0.2510, 0.7529,
    0.3765, 0.7529, 0.7529,
    0.8784, 0.7529, 0.7529,
    0, 0.1255, 0,
    0.5020, 0.1255, 0,
    0, 0.6275, 0,
    0.5020, 0.6275, 0,
    0, 0.1255, 0.5020,
    0.5020, 0.1255, 0.5020,
    0, 0.6275, 0.5020,
    0.5020, 0.6275, 0.5020,
    0.2510, 0.1255, 0,
    0.7529, 0.1255, 0,
    0.2510, 0.6275, 0,
    0.7529, 0.6275, 0,
    0.2510, 0.1255, 0.5020,
    0.7529, 0.1255, 0.5020,
    0.2510, 0.6275, 0.5020,
    0.7529, 0.6275, 0.5020,
    0, 0.3765, 0,
    0.5020, 0.3765, 0,
    0, 0.8784, 0,
    0.5020, 0.8784, 0,
    0, 0.3765, 0.5020,
    0.5020, 0.3765, 0.5020,
    0, 0.8784, 0.5020,
    0.5020, 0.8784, 0.5020,
    0.2510, 0.3765, 0,
    0.7529, 0.3765, 0,
    0.2510, 0.8784, 0,
    0.7529, 0.8784, 0,
    0.2510, 0.3765, 0.5020,
    0.7529, 0.3765, 0.5020,
    0.2510, 0.8784, 0.5020,
    0.7529, 0.8784, 0.5020,
    0, 0.1255, 0.2510,
    0.5020, 0.1255, 0.2510,
    0, 0.6275, 0.2510,
    0.5020, 0.6275, 0.2510,
    0, 0.1255, 0.7529,
    0.5020, 0.1255, 0.7529,
    0, 0.6275, 0.7529,
    0.5020, 0.6275, 0.7529,
    0.2510, 0.1255, 0.2510,
    0.7529, 0.1255, 0.2510,
    0.2510, 0.6275, 0.2510,
    0.7529, 0.6275, 0.2510,
    0.2510, 0.1255, 0.7529,
    0.7529, 0.1255, 0.7529,
    0.2510, 0.6275, 0.7529,
    0.7529, 0.6275, 0.7529,
    0, 0.3765, 0.2510,
    0.5020, 0.3765, 0.2510,
    0, 0.8784, 0.2510,
    0.5020, 0.8784, 0.2510,
    0, 0.3765, 0.7529,
    0.5020, 0.3765, 0.7529,
    0, 0.8784, 0.7529,
    0.5020, 0.8784, 0.7529,
    0.2510, 0.3765, 0.2510,
    0.7529, 0.3765, 0.2510,
    0.2510, 0.8784, 0.2510,
    0.7529, 0.8784, 0.2510,
    0.2510, 0.3765, 0.7529,
    0.7529, 0.3765, 0.7529,
    0.2510, 0.8784, 0.7529,
    0.7529, 0.8784, 0.7529,
    0.1255, 0.1255, 0,
    0.6275, 0.1255, 0,
    0.1255, 0.6275, 0,
    0.6275, 0.6275, 0,
    0.1255, 0.1255, 0.5020,
    0.6275, 0.1255, 0.5020,
    0.1255, 0.6275, 0.5020,
    0.6275, 0.6275, 0.5020,
    0.3765, 0.1255, 0,
    0.8784, 0.1255, 0,
    0.3765, 0.6275, 0,
    0.8784, 0.6275, 0,
    0.3765, 0.1255, 0.5020,
    0.8784, 0.1255, 0.5020,
    0.3765, 0.6275, 0.5020,
    0.8784, 0.6275, 0.5020,
    0.1255, 0.3765, 0,
    0.6275, 0.3765, 0,
    0.1255, 0.8784, 0,
    0.6275, 0.8784, 0,
    0.1255, 0.3765, 0.5020,
    0.6275, 0.3765, 0.5020,
    0.1255, 0.8784, 0.5020,
    0.6275, 0.8784, 0.5020,
    0.3765, 0.3765, 0,
    0.8784, 0.3765, 0,
    0.3765, 0.8784, 0,
    0.8784, 0.8784, 0,
    0.3765, 0.3765, 0.5020,
    0.8784, 0.3765, 0.5020,
    0.3765, 0.8784, 0.5020,
    0.8784, 0.8784, 0.5020,
    0.1255, 0.1255, 0.2510,
    0.6275, 0.1255, 0.2510,
    0.1255, 0.6275, 0.2510,
    0.6275, 0.6275, 0.2510,
    0.1255, 0.1255, 0.7529,
    0.6275, 0.1255, 0.7529,
    0.1255, 0.6275, 0.7529,
    0.6275, 0.6275, 0.7529,
    0.3765, 0.1255, 0.2510,
    0.8784, 0.1255, 0.2510,
    0.3765, 0.6275, 0.2510,
    0.8784, 0.6275, 0.2510,
    0.3765, 0.1255, 0.7529,
    0.8784, 0.1255, 0.7529,
    0.3765, 0.6275, 0.7529,
    0.8784, 0.6275, 0.7529,
    0.1255, 0.3765, 0.2510,
    0.6275, 0.3765, 0.2510,
    0.1255, 0.8784, 0.2510,
    0.6275, 0.8784, 0.2510,
    0.1255, 0.3765, 0.7529,
    0.6275, 0.3765, 0.7529,
    0.1255, 0.8784, 0.7529,
    0.6275, 0.8784, 0.7529,
    0.3765, 0.3765, 0.2510,
    0.8784, 0.3765, 0.2510,
    0.3765, 0.8784, 0.2510,
    0.8784, 0.8784, 0.2510,
    0.3765, 0.3765, 0.7529,
    0.8784, 0.3765, 0.7529,
    0.3765, 0.8784, 0.7529,
    0.8784, 0.8784, 0.7529]
144
yolov7-tracker-example/tracker/trackeval/baselines/stp.py
Normal file
@ -0,0 +1,144 @@
"""
STP: Simplest Tracker Possible

Author: Jonathon Luiten

This simple tracker assigns track IDs which maximise the 'bounding box IoU' between previous tracks and current
detections. It is also able to match detections to tracks at more than one timestep previously.
"""

import os
import sys
import numpy as np
from multiprocessing.pool import Pool
from multiprocessing import freeze_support

sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '..')))
from trackeval.baselines import baseline_utils as butils
from trackeval.utils import get_code_path

code_path = get_code_path()
config = {
    'INPUT_FOL': os.path.join(code_path, 'data/detections/rob_mots/{split}/non_overlap_supplied/data/'),
    'OUTPUT_FOL': os.path.join(code_path, 'data/trackers/rob_mots/{split}/STP/data/'),
    'SPLIT': 'train',  # valid: 'train', 'val', 'test'.
    'Benchmarks': None,  # If None, all benchmarks in SPLIT.

    'Num_Parallel_Cores': None,  # If None, run without parallel.

    'DETECTION_THRESHOLD': 0.5,
    'ASSOCIATION_THRESHOLD': 1e-10,
    'MAX_FRAMES_SKIP': 7
}


def track_sequence(seq_file):

    # Load input data from file (e.g. provided detections)
    # data format: data['cls'][t] = {'ids', 'scores', 'im_hs', 'im_ws', 'mask_rles'}
    data = butils.load_seq(seq_file)

    # Where to accumulate output data for writing out
    output_data = []

    # To ensure IDs are unique per object across all classes.
    curr_max_id = 0

    # Run tracker for each class.
    for cls, cls_data in data.items():

        # Initialize container for holding previously tracked objects.
        prev = {'boxes': np.empty((0, 4)),
                'ids': np.array([], int),
                'timesteps': np.array([])}

        # Run tracker for each timestep.
        for timestep, t_data in enumerate(cls_data):

            # Threshold detections.
            t_data = butils.threshold(t_data, config['DETECTION_THRESHOLD'])

            # Convert mask dets to bounding boxes.
            boxes = butils.masks2boxes(t_data['mask_rles'], t_data['im_hs'], t_data['im_ws'])

            # Calculate IoU between previous and current frame dets.
            ious = butils.box_iou(prev['boxes'], boxes)

            # Score which decreases quickly for previous dets depending on how many timesteps ago they were last seen.
            prev_timestep_scores = np.power(10, -1 * prev['timesteps'])

            # Matching score is such that it first tries to match 'most recent timesteps',
            # and within each timestep maximises IoU.
            match_scores = prev_timestep_scores[:, np.newaxis] * ious
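
            # Hedged worked example (not in the original file): a track matched in
            # the immediately previous frame has prev['timesteps'] == 0 and weight
            # 10**0 = 1, while one unmatched for a frame has weight 10**-1 = 0.1.
            # With IoUs of 0.3 and 0.9 respectively, their scores are 0.3 vs 0.09,
            # so the fresher track wins despite the lower IoU; recency dominates,
            # and IoU only decides between tracks of the same age.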

            # Find best matching between current dets and previous tracks.
            match_rows, match_cols = butils.match(match_scores)

            # Remove matches that have an IoU below a certain threshold.
            actually_matched_mask = ious[match_rows, match_cols] > config['ASSOCIATION_THRESHOLD']
            match_rows = match_rows[actually_matched_mask]
            match_cols = match_cols[actually_matched_mask]

            # Assign the prev track ID to the current dets if they were matched.
            ids = np.nan * np.ones((len(boxes),), float)
            ids[match_cols] = prev['ids'][match_rows]

            # Create new track IDs for dets that were not matched to previous tracks.
            num_not_matched = len(ids) - len(match_cols)
            new_ids = np.arange(curr_max_id + 1, curr_max_id + num_not_matched + 1)
            ids[np.isnan(ids)] = new_ids

            # Update maximum ID to ensure future added tracks have a unique ID value.
            curr_max_id += num_not_matched

            # Drop tracks from 'previous tracks' if they have not been matched in the last MAX_FRAMES_SKIP frames.
            unmatched_rows = [i for i in range(len(prev['ids'])) if
                              i not in match_rows and (prev['timesteps'][i] + 1 <= config['MAX_FRAMES_SKIP'])]

            # Update the set of previous tracking results to include the newly tracked detections.
            prev['ids'] = np.concatenate((ids, prev['ids'][unmatched_rows]), axis=0)
            prev['boxes'] = np.concatenate((np.atleast_2d(boxes), np.atleast_2d(prev['boxes'][unmatched_rows])), axis=0)
            prev['timesteps'] = np.concatenate((np.zeros((len(ids),)), prev['timesteps'][unmatched_rows] + 1), axis=0)

            # Save result in output format to write to file later.
            # Output Format = [timestep ID class score im_h im_w mask_RLE]
            for i in range(len(t_data['ids'])):
                row = [timestep, int(ids[i]), cls, t_data['scores'][i], t_data['im_hs'][i], t_data['im_ws'][i],
                       t_data['mask_rles'][i]]
                output_data.append(row)

    # Write results to file
    out_file = seq_file.replace(config['INPUT_FOL'].format(split=config['SPLIT']),
                                config['OUTPUT_FOL'].format(split=config['SPLIT']))
    butils.write_seq(output_data, out_file)

    print('DONE:', seq_file)


if __name__ == '__main__':

    # Required to fix bug in multiprocessing on windows.
    freeze_support()

    # Obtain list of sequences to run tracker for.
    if config['Benchmarks']:
        benchmarks = config['Benchmarks']
    else:
        benchmarks = ['davis_unsupervised', 'kitti_mots', 'youtube_vis', 'ovis', 'bdd_mots', 'tao']
        if config['SPLIT'] != 'train':
            benchmarks += ['waymo', 'mots_challenge']
    seqs_todo = []
    for bench in benchmarks:
        bench_fol = os.path.join(config['INPUT_FOL'].format(split=config['SPLIT']), bench)
        seqs_todo += [os.path.join(bench_fol, seq) for seq in os.listdir(bench_fol)]

    # Run in parallel
    if config['Num_Parallel_Cores']:
        with Pool(config['Num_Parallel_Cores']) as pool:
            results = pool.map(track_sequence, seqs_todo)

    # Run in series
    else:
        for seq_todo in seqs_todo:
            track_sequence(seq_todo)
@ -0,0 +1,92 @@
"""
Thresholder

Author: Jonathon Luiten

Simply reads in a set of detections, thresholds them at a certain score threshold, and writes them out again.
"""

import os
import sys
from multiprocessing.pool import Pool
from multiprocessing import freeze_support

sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '..')))
from trackeval.baselines import baseline_utils as butils
from trackeval.utils import get_code_path

THRESHOLD = 0.2

code_path = get_code_path()
config = {
    'INPUT_FOL': os.path.join(code_path, 'data/detections/rob_mots/{split}/non_overlap_supplied/data/'),
    'OUTPUT_FOL': os.path.join(code_path, 'data/detections/rob_mots/{split}/threshold_' + str(100*THRESHOLD) + '/data/'),
    'SPLIT': 'train',  # valid: 'train', 'val', 'test'.
    'Benchmarks': None,  # If None, all benchmarks in SPLIT.

    'Num_Parallel_Cores': None,  # If None, run without parallel.

    'DETECTION_THRESHOLD': THRESHOLD,
}


def do_sequence(seq_file):

    # Load input data from file (e.g. provided detections)
    # data format: data['cls'][t] = {'ids', 'scores', 'im_hs', 'im_ws', 'mask_rles'}
    data = butils.load_seq(seq_file)

    # Where to accumulate output data for writing out
    output_data = []

    # Run for each class.
    for cls, cls_data in data.items():

        # Run for each timestep.
        for timestep, t_data in enumerate(cls_data):

            # Threshold detections.
            t_data = butils.threshold(t_data, config['DETECTION_THRESHOLD'])

            # Save result in output format to write to file later.
            # Output Format = [timestep ID class score im_h im_w mask_RLE]
            for i in range(len(t_data['ids'])):
                row = [timestep, int(t_data['ids'][i]), cls, t_data['scores'][i], t_data['im_hs'][i],
                       t_data['im_ws'][i], t_data['mask_rles'][i]]
                output_data.append(row)

    # Write results to file
    out_file = seq_file.replace(config['INPUT_FOL'].format(split=config['SPLIT']),
                                config['OUTPUT_FOL'].format(split=config['SPLIT']))
    butils.write_seq(output_data, out_file)

    print('DONE:', seq_file)


if __name__ == '__main__':

    # Required to fix bug in multiprocessing on windows.
    freeze_support()

    # Obtain list of sequences to run tracker for.
    if config['Benchmarks']:
        benchmarks = config['Benchmarks']
    else:
        benchmarks = ['davis_unsupervised', 'kitti_mots', 'youtube_vis', 'ovis', 'bdd_mots', 'tao']
        if config['SPLIT'] != 'train':
            benchmarks += ['waymo', 'mots_challenge']
    seqs_todo = []
    for bench in benchmarks:
        bench_fol = os.path.join(config['INPUT_FOL'].format(split=config['SPLIT']), bench)
        seqs_todo += [os.path.join(bench_fol, seq) for seq in os.listdir(bench_fol)]

    # Run in parallel
    if config['Num_Parallel_Cores']:
        with Pool(config['Num_Parallel_Cores']) as pool:
            results = pool.map(do_sequence, seqs_todo)

    # Run in series
    else:
        for seq_todo in seqs_todo:
            do_sequence(seq_todo)
Some files were not shown because too many files have changed in this diff.